







April, 1953 Vol. 17, No. 2 
CONTENTS

Rigidity and Shock Therapy of Psychotics: An Experimental Study:
    Maxwell S. Pullen and Ross Stagner

A Specific Relapse Phenomenon During the Course of Electric Convulsive Therapy:
    Roderick W. Pugh

Psychological Changes Following Operations on the Human Frontal Lobe:
    Sidney Crown

Generalization of Problem-Solving Rigidity:
    Emory L. Cowen, Morton Wiener, and Judith Hess

The Use of Rorschach Scores to Predict Whether Patients Will Continue Psychotherapy:
    Frank Auld, Jr., and Leonard D. Eron

Rorschach Scoring Categories as Diagnostic “Signs”:
    Martin Berkowitz and Jacob Levine

Group Psychotherapy with Acutely Disturbed Psychotic Patients:
    Herman Feifel and Arnold D. Schwartz

The Treatment of Delayed Speech by Client-Centered Therapy:
    Henry J. Dupont, Theodore Landsman, and Milton Valentine

Response to the Human Face as a Standard Stimulus:
    Ernst G. Beier, Carroll E. Izard, Charles D. Smock, and Roland R. Tougas

Communication and Rapport in Clinical Testing:
    David Cole

A Validation of Changes in Scores on the Index of Adjustment and Values as Measures of Changes in Emotionality:
    Robert E. Bills

The Madeleine Thomas Completion Stories Test:
    Eugene S. Mills

MMPI Profiles and Personality Characteristics:
    H. Birnet Hovey

A Comparison of the WISC and Stanford-Binet IQ's of Normal Children:
    Glen A. Holland

Note on Elwood’s Study of IQ Changes:
    Quinn McNemar
















Journal of Consulting Psychology
Vol. 17, No. 2, 1953


Rigidity and Shock Therapy of Psychotics: 
An Experimental Study 


Maxwell S. Pullen 


Ohio University 


and Ross Stagner 


University of Illinois 


Convulsive shock therapy has, in a little more 
than a decade, become a major tool in the psy- 
chiatric treatment of mental disorders. Particu- 
larly for those psychotics who are inaccessible 
to psychotherapy, it has become practically a 
routine prescription in many hospitals. Yet it 
seems no exaggeration to say that today we 
know relatively little about its actual behavioral
effects (outside of amnesia); and certainly
we have no generally accepted theory as to
the underlying processes involved. Psycholo- 
gists and psychiatrists urgently need a single 
formulation which will clearly relate the symp- 
toms and dynamic structure of the individual 
patient to the effects of shock therapy, if this 
is to be used with maximum efficiency. 

One such formulation is that shock therapy 
modifies rigidity among psychotics [see, e.g.,
Billig and Sullivan, 1]. This implies (a) that
rigidity is a major component of the psychotic
syndrome, and (b) that one effect of shock
therapy is to change this component. To verify 
this hypothesis or disprove it, we shall need 
agreement as to the meaning of rigidity, and a 
good measure of its extent before and after 
treatment. 

The notion that psychotics, especially schizo- 
phrenics, are unusually “rigid” has consider- 
able appeal. The schizophrenic is likely to 
show stereotyped behavior, inability to change 
when the situation calls for change. His be- 
liefs are set and resistant to logic. It is difficult 
to get him to “see things differently.” How- 
ever, when we look at the experimental litera- 
ture on the measurement of rigidity, we en- 
counter complications. There are marked dif- 
ferences of opinion as to whether rigidity is a 
unitary phenomenon, and even whether “rigid” 




behavior as defined by the experimenter fits the 
psychotic syndrome just described. 

Since the literature has recently been an- 
alyzed by Cattell [2], Cattell and Tiner [3], 
Fisher [4], Luchins [6], Rokeach [8], and 
others, we shall not attempt another survey. 
Some of the conclusions offered by Fisher will 
be accepted as a starting point for a description 
of our specific research design:


1. In an intellectually homogeneous group, in- 
dividual differences in rigidity exist. 

2. Individuals with organic pathology are more 
rigid than individuals without such pathology. 


3. Certain kinds of schizophrenics seem to be 
more rigid than other kinds. 


4. Isolated persons seem to be more rigid than 
non-isolated persons who are otherwise comparable. 


5. When different kinds of rigidity measures are 
applied to the same group, there frequently are real 
differences in the results given by the various 
measures. 


On the basis of our study of the literature 
on rigidity and on shock therapy, we concluded 
that there was ample justification for a study 
designed to test the hypothesis that convulsive 
shock therapy is followed by a reduction in the 
rigidity of the patient’s behavior. 


Method 


The factor-analytic studies of Cattell and 
others indicate that there may be more than 
one kind of rigidity. Certainly there are differ- 
ing definitions of the phenomenon. At the out- 
set, therefore, we elected to define rigidity as 
“a generalized tendency to cling to or persev- 
erate with a previously-made response, in lieu 
of switching to a response which is more ap- 
propriate to the prevailing experimental con- 










ditions.” The choice of tests was then made in 
conformity with this definition, guided also by 
the general intelligence level and cooperative- 
ness of our proposed subjects. 


We have adopted four tests from the bat- 
tery used by Cattell and his students. One of 
these is a motor test (creative effort), while 
three are perceptual (hidden objects; riddles; 
and flicker fusion threshold). One test was 
borrowed from V. W. Grant (hidden words). 
Four new tests were devised for the study: 
changing colors; changing figures; changing 
pictures; and flicker fusion overlap. Also used 
in the battery was the Short Form Wechsler- 
Bellevue scale. The following brief descrip- 
tions identify each test: 


1. Changing colors: one set of 20 cards varied 
by minute gradations from a saturated red to a 
saturated orange, and another set of 26 cards varied 
from blue to green. The subject saw the cards
within a black exposure box, uniformly illuminated,
and presented in series.¹ He was told that the color
would change, but that he was simply to name each
color as he saw it. Each S did four runs, one in
each direction with each set of cards. The measure
was the amount of overlap (persisting with a
response past the point at which he had abandoned
it when going in the other direction). Thus, he
may have switched from “red” to “orange” at card
15, but from “orange” to “red” at card 12. The
overlap of three cards presumably signifies rigidity 
in failing to change response. 

2. Changing figures: 20 cards ranged from a 
perfect equilateral triangle to a perfect circle; and 
22 cards ranged from a perfect square to a four- 
pointed star. Measure as in test 1. 

3. Changing pictures: 11 cards changing gradually
from the face of a pirate to a silhouette of a
rabbit (adapted from Leeper [5]); 13 cards changing
from young woman to old woman (adapted
from Leeper); 11 cards changing from an old man
fishing to a young boy (adapted from Luchins).
The score on this series is simply the number of the 
card at which the new percept is noted. 

4. Creative effort: subject writes “ready” nor- 
mally, for one minute; then writes the same word 
in reverse script, for one minute. Next he writes 
“237” normally and in reverse; and then his own 
first and last name, normally and in reverse. Score
is given by (x − y)/x, in which x is the number of
letters written normally and y the number written in reverse stroke.
(Adapted from Cattell.) 


¹All testing was done individually by Mr. Pullen.
We wish to thank the administration (Dr. O. D.
Timm, Chief of Professional Services, Dr. L. A.
Pennington, Chief Psychologist, and Dr. L. E.
Trent, Manager) at the Danville VA Hospital for
their cooperation on this study.


5. Hidden objects: five mimeographed puzzle 
pictures. Score is number of hidden objects located 
in one minute per picture (Cattell). 


6. Hidden words: Subject is shown a card with
the letters WILLOWLEAFF, some of which are underscored.
He is shown that the underscored letters give the
name of an animal. After two more marked
samples, he is given the test of 12 unmarked cards,
at the rate of 30 seconds per card. The score is
the number of animal names located within the
time limit.²

7. Riddles: after a few samples to be sure that 
S understands, he is given 12 riddles, each to be 
solved within 30 seconds. Score: number solved. 


8. Flicker fusion threshold: S was partially dark
adapted with dark goggles during the riddles test.
He is now asked to look into a black shadow box
at a square of milk-glass, behind which is a
Strobotac, set at 3700 RPM. E then slows the
Strobotac down until S observes flicker, and goes
all the way down to 600 RPM, then back. Score:
mean reading of 8 ascending and 8 descending
determinations of threshold (not including one
practice run each way). Low fusion threshold is assumed
to indicate rigidity.


9. Flicker fusion overlap: the mean of the re- 
ported thresholds for the 8 “descending” runs is 
subtracted from the mean for the 8 “ascending” 
runs, on the hypothesis that clinging to the percept 
once established indicates rigidity. 


10. The weighted scores on Comprehension, 
Arithmetic, and Similarities of the Wechsler-Belle- 
vue were added to give an intelligence measure for 
inclusion in the battery. 
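
The scoring rules for three of the measures above (tests 1 and 2, test 4, and test 9) can be sketched in code. This is only an illustrative sketch: the function names are hypothetical, the creative-effort formula is read here as (x − y)/x, and the numbers in the example are invented, not data from the study.

```python
from statistics import mean


def overlap_score(switch_forward, switch_backward):
    """Overlap, in cards, for one changing-stimulus series (tests 1 and 2).

    switch_forward:  card at which S changed response on the forward run
                     (e.g. "red" -> "orange" at card 15).
    switch_backward: card at which S changed back on the return run
                     (e.g. "orange" -> "red" at card 12).
    A larger positive value means S held each response longer.
    """
    return switch_forward - switch_backward


def creative_effort_score(normal_count, reverse_count):
    """Test 4: proportional drop in output under reversed writing.

    Assumes the printed score is (x - y) / x, with x the number of
    letters written normally and y the number written in reverse.
    """
    if normal_count <= 0:
        raise ValueError("normal_count must be positive")
    return (normal_count - reverse_count) / normal_count


def flicker_fusion_overlap(ascending, descending):
    """Test 9: mean ascending threshold minus mean descending threshold."""
    return mean(ascending) - mean(descending)


if __name__ == "__main__":
    # The worked example from the description of test 1: cards 15 and 12.
    print(overlap_score(15, 12))
    # Invented counts: 40 letters written normally, 10 in reverse.
    print(creative_effort_score(40, 10))
```

All three scores are arranged so that a larger value corresponds to more persistence with an earlier response, which is the sense of rigidity adopted in the definition above.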


Subjects 


The Experimental Group consisted of 35 
psychotic patients from the Danville VA Hos- 
pital who were designated to receive convul- 
sive shock therapy (electric or insulin). The 
Control Group included 25 psychotic patients 
from the Danville VA Hospital; it was chosen
to be similar to the Experimentals in the following
respects: (a) average age; (b) diagnostic
categories represented; (c) duration of
illness; (d) degree of incapacitation; (e) 
amount of preshock rigidity, as measured by 
our battery. 

Each experimental subject was tested not 
earlier than 72 hours prior to his first shock 
treatment, and again, not later than 72 hours 
after his last shock treatment. Control sub- 
jects were tested as they were chosen, and later 


retested at such intervals as to make the test-retest
interval comparable for the two groups.

²We wish to thank Dr. V. W. Grant, Hawthornden
State Hospital, Macedonia, Ohio, for permission
to use this test.


Statistical Analysis


Our first purpose was to establish a composite
single score for preshock rigidity for
each patient. To achieve this we proceeded as
follows: the nine tests in the battery were
intercorrelated (Pearson r) to give the matrix
shown in Table 1. (The “changing pictures”
test, Test 3, was dropped from the statistical
analysis because on retest virtually every control
subject recognized it immediately and
made the switch purely on a memory basis.)
All 60 subjects (experimental and control)
were used for these correlations.


Table 1
Intertest Correlations

Test                             2     4     5     6     7     8     9    10
1. Changing colors             .59   .31  -.41  -.15  -.03  -.01   .22  -.01
2. Changing figures                  .25  -.38  -.23  -.15  -.07   .14  -.17
4. Creative effort                        -.32  -.32  -.09  -.06   .08   .04
5. Hidden objects                                .65   .57   .06  -.52   .22
6. Hidden words                                        .57   .28  -.24   .50
7. Riddles                                                   .18  -.41   .54
8. Flicker fusion threshold                                       -.21   .00
9. Flicker fusion overlap                                               -.27
10. Wechsler-Bellevue


A Thurstone centroid factor analysis was
made of Table 1. Two factors emerged, which
upon rotation were identified as a “rigidity”
factor and an “intelligence” factor (Tables 2
and 3 and Fig. 1). Heavily loaded for rigidity
were Tests 1 and 2, with 4 having a moderate
positive loading, 5 moderate negative.


Table 2
Reference Factor Loadings

                 Factor
Test       I       II      I²     II²     h²
  1      -.49     .61     .24     .37     .61
  2      -.53     .51     .29     .26     .54
  4      -.34     .20     .12     .04     .16
  5       .80     .19     .64     .04     .68
  6       .77     .41     .59     .17     .76
  7       .69     .59     .48     .35     .82
  8      [illegible]
  9      [illegible]
 10       .43     .30     .18     .09     .27


Table 3
Rotated Factor Loadings

                 Factor*
Test       I'      II'     I'²    II'²    h²
  1       .78     .12     .61     .01     .62
  2       .72     [remaining entries illegible]
  4       .38     [remaining entries illegible]
  5      -.40     .79     .16     .62     .78
  6      -.22     .87     .05     .76     .81
  7      -.04     .89     .00     .79     .79
  8      -.06     .20     .00     .04     .04
  9       .18    -.56     .03     .31     .35
 10      -.08     .52     .01     .27     .28

*I': “Rigidity.” II': “Intelligence.”


We next proceeded to compute z scores for
individual subjects on the rigidity factor. This
was done as follows: Thurstone has pointed
out that $FP = S$, where S is the matrix of
individual z scores on the several tests, F is the
factor matrix, and P is the matrix of individual
z scores on the factors involved. Since we
know F and S, we can find P by premultiplying
both sides of Thurstone’s equation by
$(F'F)^{-1}F'$ [10, p. 68], from which we get:

$$P_1 = (F'F)^{-1}F'S_1.$$

This enables us to compute factor scores for
each subject on Factor I' and II'.³

It is now necessary to obtain change scores 
based on the variation in rigidity from the first 
to second testing. We obtained these by first 
solving the equation: 

$$P_2 = (F'F)^{-1}F'S_2,$$

in which the same F matrix was used, but
where $S_2$ was computed as follows: each S’s
raw score on the second testing was converted
into the z score he would have obtained had
each distribution retained its same mean and
sigma as on the first administration. We
interpret these as absolute measures of rigidity
based on the same scale units derived in matrix
$P_1$.

³Since the above equation assumes orthogonal
factors, we rerotated the intelligence factor to an
orthogonal relation to the rigidity factor. This
means that the z scores for Factor II' are not very
pure measures of intelligence, but since they are
not used in the further analysis, no harm is done.


Fig. 1. Original and rotated axes.


Having determined for each S a rigidity z
score for test and retest, we subtracted the first
from the second to obtain a measure of change.
This will be referred to as his D score. The
score on the original testing will be referred to
as the R score.
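
The computation of R and D scores described above can be sketched for a single subject and a two-factor solution. This is a minimal sketch under stated assumptions, not the authors' actual procedure: the function names are hypothetical, the loadings and scores in the example are invented, and the 2 x 2 inverse is written out by hand only to keep the example free of external libraries.

```python
def factor_scores(F, s):
    """Least-squares factor scores p = (F'F)^-1 F's for one subject.

    F: list of [loading_on_I, loading_on_II], one row per test.
    s: list of the subject's z scores, one per test (same order as F).
    Returns (score_on_I, score_on_II).
    """
    # Entries of the symmetric 2 x 2 matrix F'F = [[a, b], [b, d]].
    a = sum(row[0] * row[0] for row in F)
    b = sum(row[0] * row[1] for row in F)
    d = sum(row[1] * row[1] for row in F)
    det = a * d - b * b
    if det == 0:
        raise ValueError("F'F is singular; factor scores are undefined")
    # The two entries of F's.
    g0 = sum(row[0] * z for row, z in zip(F, s))
    g1 = sum(row[1] * z for row, z in zip(F, s))
    # Multiply by the inverse of F'F, i.e. (1/det) * [[d, -b], [-b, a]].
    return ((d * g0 - b * g1) / det, (a * g1 - b * g0) / det)


def d_score(F, first_raw, second_raw, first_means, first_sds):
    """Change in the rigidity factor score (factor I) from test to retest.

    Both administrations are standardized with the FIRST administration's
    means and sigmas, so the two factor scores share one scale.
    """
    s1 = [(x - m) / sd for x, m, sd in zip(first_raw, first_means, first_sds)]
    s2 = [(x - m) / sd for x, m, sd in zip(second_raw, first_means, first_sds)]
    r_first = factor_scores(F, s1)[0]   # the R score
    r_second = factor_scores(F, s2)[0]
    return r_second - r_first           # the D score


if __name__ == "__main__":
    # Invented loadings for three tests on two factors.
    F = [[1, 0], [0, 1], [1, 0]]
    print(factor_scores(F, [1.0, 2.0, 3.0]))
    print(d_score(F, [10, 5, 10], [8, 5, 6], [10, 5, 10], [2, 1, 2]))
```

A negative D score under this convention means the subject's rigidity factor score fell between the two administrations.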


Results 


Our principal interest is in the effect of shock 
therapy in reducing rigidity in the Experimen- 
tal as compared with the Control group. As 
Table 4 shows, the mean D score for Experi- 
mentals was —.76, Controls —.53, giving a 
net difference in favor of the Experimentals of 
0.23 z-score points. Thus both groups show a 
decline in rigidity, the Experimentals decreas- 
ing more. The confidence level for this differ- 
ence is between .15 and .20; in other words, 
we cannot reject the null hypothesis and assert 





that the shock therapy has significantly reduced
rigidity, although the trend is in that direction.


Table 4
Group Comparisons of D-score Means*
(total sample)

Group (mean D score)             Group (mean D score)               Diff.   S.E. diff.    t     df    p†
Experimental, all (-.76)         Control, all (-.53)                 .23      .24        .98    58    .20>p>.15
Experimental, improved (-.96)    Control, all (-.53)                 .43      .24       1.83    47    .05>p>.025
Experimental, improved (-.96)    Experimental, unimproved (-.34)     .62      .45       1.36    33    .10>p>.05

*D scores are differences in rigidity factor scores
(R scores) from Administration I to Administration II
of test battery.

†One-tailed test.


If we reject from the Experimentals those 
patients rated as unimproved by the psychia- 
trists after shock, the mean D score obtained 
is significant at a confidence level between .025 
and .05. The improved and unimproved Ex- 
perimentals differ by a quantity approaching 
significance (confidence level between .05 and
.10).
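
The arithmetic behind these comparisons can be checked with a short sketch. It assumes that the columns of Table 4 are the difference between means, its standard error, the t ratio, and the degrees of freedom, and that degrees of freedom are the two group sizes less two; the function names are hypothetical.

```python
def t_ratio(mean_a, mean_b, se_diff):
    """Difference between two group means divided by its standard error."""
    return (mean_a - mean_b) / se_diff


def pooled_df(n_a, n_b):
    """Degrees of freedom for a two-group comparison of means."""
    return n_a + n_b - 2


if __name__ == "__main__":
    # Experimentals (all) against Controls (all), using the rounded
    # means and standard error as printed.
    print(round(t_ratio(-0.76, -0.53, 0.24), 2))
    # 35 Experimentals and 25 Controls.
    print(pooled_df(35, 25))
```

Recomputing from the rounded entries gives a ratio of about .96 in absolute value against the printed .98; a gap of that size is what one would expect if the printed t was computed from unrounded means and standard error. The sign is negative only because the Experimental mean is entered first.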

The above findings are virtually repeated if 
we limit our sample to schizophrenics.* The 
only interesting change in confidence levels is 
that the Experimental improved vs. unim- 
proved comparison is now highly significant. 

Significance of original R score. Despite the 
fact that the differences in mean D score con- 
sistently favor the group receiving shock, and 
that this difference is statistically significant 
for those patients rated as improved, there may 
be variables other than shock which could ac- 
count for the finding. One of these is size of 
original rigidity score. 

There is a correlation of −.45 between R
score and D score for the total group. This
means that, in general, those who were low on
rigidity at the beginning have tended to
increase, or those who were high to decrease in
score. However, when the Experimental and
Control groups are treated separately, we find
that this is true only within the Experimental
group (r = −.49), not in the Controls (r =
−.68). Thus it cannot be simply a manifestation
of the general tendency to regress toward
the mean on retesting.

*This involves the elimination of only five cases.
Data are not reproduced here, but can be found in
full in Pullen [7].
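
A correlation between initial score and change score, computed separately for each group, can be sketched as follows. The function name and the example values are hypothetical; no subject-level data are given in the article.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented R and D scores for three subjects in one group: the higher
    # the initial rigidity, the larger the subsequent decrease.
    r_scores = [1.0, 2.0, 3.0]
    d_scores = [-1.0, -2.0, -3.0]
    print(pearson_r(r_scores, d_scores))
```

A negative coefficient of this kind is what regression toward the mean alone would produce in both groups, which is why the argument turns on comparing the Experimental and Control coefficients.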

It is also possible to show that the differ- 
ences are not due to differences between groups 
in original R score. Table 5 shows that the 
Experimentals and Controls differ only at the 
third decimal place on R score mean, and the 
Experimental Improved were actually a shade 
more rigid on initial testing than the unim- 
proved fraction. None of these differences even 
approach significance. 


Table 5
Group Comparisons of R-score Means*
(total sample)

Group (mean R score)             Group (mean R score)               Diff.   S.E. diff.    t     df    p†
Experimental, all (.002)         Control, all (-.004)                .01      .28        .02    58    .95>p>.90
Experimental, improved (.07)     Control, all (.00)                  .07      .31        .23    47    .90>p>.80
Experimental, improved (.07)     Experimental, unimproved (-.14)     .21      .47        .45    33    .70>p>.60

*R scores are scores on the rigidity factor on first
administration of the battery.

†Two-tailed test.


Were the Controls more severely disturbed 
than the Experimentals prior to shock? Each 
patient was rated for degree of incapacitation 
just prior to the first testing. Table 6 shows 
that none of the differences between groups 
in degree of disturbance approached signifi- 
cance. It is, however, interesting to note that 
there was a slight (not significant) tendency 
for the severely disturbed group to make high- 
er R scores than those less disturbed. In view 
of the data shown in Tables 5 and 6, it is clear 
that this has not contributed spuriously to the 
findings on the effect of shock therapy. 










Table 6
Distribution of “Degree of Incapacitation”
(DOI) Variable

                                      Groups Compared
                       Experimental      Improved experimental    Improved experimental with
                       with control      with control             unimproved experimental
Value of chi square        .43               1.08                     1.16
Level of confidence    .50>p>.30         .35>p>.25                .30>p>.20





Specific tests. The data have also been com- 
pared for each of the nine tests in the final 
battery. Of these, tests 1, 2, and 5 (chang- 
ing colors, changing figures, and hidden ob- 
jects) most clearly show decrease in rigidity 
after shock. Tests 6, 7, and 9 (hidden words, 
riddles, and flicker fusion overlap) show some 
confirming differences. Tests 4 and 8 (motor 
reversal and flicker fusion threshold) contra- 
dicted the trend; shock cases made smaller 
changes away from rigidity than did the con- 
trols, and the Improved cases made smaller 
changes than the Unimproved. 


Intelligence. Of incidental interest may be 
our finding that intelligence was not signifi- 
cantly affected by shock. As other studies have 
shown, our Experimentals showed a slight de- 
cline in Wechsler-Bellevue score, but it is par- 
ticularly interesting to find that the Improved 
group declined less than the Unimproved shock 
cases. None of the differences approach sig- 
nificance. 

Discussion 


We feel that our results justify the follow- 
ing generalizations: (a) rigidity, defined as 
the tendency to persist in a previously made 
response, in the face of changing conditions 
which render the response inappropriate, is a 
generalized factor, manifest in a variety of 
different test situations; (b) rigidity, so defined,
is decreased by shock therapy; and (c)
persons classified as improved show more de- 
crease in rigidity than patients classified as un- 
improved after shock therapy. However, we do 
not find, either in our own data or in the liter- 
ature, a satisfactory suggestion as to the nature 
of the basic mechanism involved. 

Because of the fairly obvious resemblance of 
our rigidity measures to the familiar negative 


transfer (proactive inhibition) experiment, an 
interpretation in terms of habit interference is 
inviting. The fact that our best measures, both 
of rigidity and of change, were perceptual in 
character tempts us to speculate on a parallel 
phenomenon of perceptual interference. Each 
such approach seems somewhat inadequate, 
however, because it ignores the generalized 
character of the phenomenon. We are not deal- 
ing here with a simple case in which habit X 
interferes with habit Y; instead, we have in- 
dividual M showing, in a variety of situations, 
more interference than individual N. 

As a tentative interpretation we offer the 
following speculations regarding the rigid per- 
sonality. Every individual can be conceived as 
developing certain systems, perceptual and be- 
havioral, which protect his essential inner 
equilibria [Stagner, 9]. The normal personal- 
ity is characterized by perceptual systems of 
alertness to changing stimuli, and response se- 
quences oriented to these external stimuli. For 
other personalities, alert perception of change 
may not have led to maintenance of equilib- 
rium; relatively greater reinforcement may 
have followed from ignoring the changing 
stimulus, and plugging away with an invariant 
response. Could we describe the rigid person- 
ality as one which has learned to protect inner 
equilibrium by blocking out awareness of the 
changes in the external world? This would 
certainly fit some of our schizophrenics! 

As soon as we start defining extremes of 
these response tendencies, however, we become 
aware that normal people show some rigid be- 
havior, and few rigid individuals refuse com- 
pletely to observe and adapt to change. The 
person with a clear-cut reality orientation may 
still, at times, withdraw and ignore the outer 
world, attending presumably to inner states; 
and correspondingly, even the withdrawn 
schizophrenic can be induced to respond to ex- 
ternal stimuli if they are sufficiently demand- 
ing. 

This suggests that we are dealing with a 
balance of two complex response systems, both 
being developed to some extent in every per- 
sonality. Let us label as System R the system 
of perceptual and response tendencies which are 
closely oriented to reality. These are motive- 
satisfying, tension-reducing activities which in- 
volve a need to attend to changing external 













stimuli and to modify responses in an appro- 
priate manner. Now let us label as System W 
the system of perceptual and response tenden- 
cies which are oriented away from the external 
stimulus. These likewise are activities which 
have served to satisfy motives, to reduce rela- 
tive tension or to avoid increase of tension. 
Each of us feels a need, at times, to get away 
from painful reality. Fantasy is only one of the 
many devices by which we block responses to 
the present external world. 


We propose, in brief, that all individuals 
develop both System R and System W, but 
that the relative balance of the two varies 
widely. The alert, reality-oriented extrovert 
has learned a predominance of R patterns; 
these have led to a maximum of reinforcement 
in his experience. The resistant, uncooperative, 
rigid schizoid has not found satisfaction in R- 
type responses; on the contrary, his motives 
have been channeled into withdrawing, ignor- 
ing, avoiding patterns. Obviously, under pres- 
sure the W system can be pushed aside and the 
R system activated; or, with the R system sa- 
tiated, W responses may dominate the organ- 
ism. It would be predicted that the extent of 
learning of such systems would determine the 
relative rigidity of each personality. 

In the situation presented by our rigidity 
test battery, the patient no doubt would have 
preferred to continue his W system activities 
(autisms, fantasies). If, however, the W needs 
are relatively satiated, or if R stimuli of con- 
siderable magnitude are present, it is possible to 
activate the R system. Under pressure from the 
experimenter, the patient will look at the card, 
name the color, etc. The R stimulus has ex- 
ceeded the threshold value needed for effective 
competition with the W system. However, the 
R needs satiate readily, the strong W needs 
take over, and he attempts to withdraw from 
the situation. If the experimenter continues to 
press for a response, the patient is likely to re- 
peat his preceding response; this is available 
with little energy required, and it serves as a 
tension reducer by getting rid of the pressure 
from the experimenter. Thus, the schizophrenic 
tests “rigid” not because he is incapable of 
perceiving the stimulus and changing his re- 
sponse, but because he has adopted a different 
device for maintaining equilibrium. As the 
situation looks to him, the easiest way to keep 


the experimenter at arm’s length and thus con- 
tinue with preferred autistic responses, is to 
keep giving a response once found acceptable. 

Why, then, does the schizophrenic change 
his response at all? Eventually, we assume, the 
growing discrepancy between the stimulus he 
is receiving and the response he makes reaches 
a threshold value; i.e., while he ignores a minor 
degree of “inappropriateness,” a major discrepancy
is potent enough to block out the W
system and reinstate the R system. As the ex- 
periments on distraction have shown, we shut 
out disturbing external stimuli by channeling 
more energy into the on-going response system. 
However, even a highly motivated habit pat- 
tern can be disrupted if the external stimulus 
becomes sufficiently intense. 

Does such an interpretation of personality 
rigidity help in the understanding of shock 
therapy? We believe that it does. Here we 
rely particularly upon the amnesic effects of 
shock. The W system of the schizophrenic is 
composed of percepts, autistic habits, fantasies, 
and avoidance behaviors of fairly recent origin. 
His R system, of observing external events, 
obeying verbal instructions, and responding to 
changes in the field, was laid down earlier in 
childhood. Shock, by inducing amnesia for the 
stronger but later habits, “uncovers” the R 
system and frees it from interference. Further, 
by disorganizing the fantasy world, shock may 
make it less attractive, so that potential grati- 
fications in the real world seem more desirable. 

Essentially, then, we are suggesting that 
shock therapy modifies rigidity by its effects 
on memory. In a more precise way, we would 
say that rigidity is a complex form of percep- 
tual interference and habit interference, based 
upon generalized response systems. Learned 
motive patterns support one system oriented to 
reality, to changing stimuli, to demands of 
society, and another system involving responses 
of withdrawal, fantasy and avoidance of social 
requirements. Rigidity develops as the latter 
responses increasingly interfere with the for- 
mer; and rigidity decreases when an event, 
such as shock therapy, breaks up the withdraw- 
al habit patterns and thus eliminates interfer- 
ences with earlier, reality-oriented responses. 


Summary 


1. We have defined rigidity as the tendency 










to persist in a previously made response, in lieu 
of switching to one which is more appropriate 
to the stimulus situation. 

2. Factor analysis confirms the view that 
rigidity as defined is a generalized feature of 
psychotic personalities. 

3. Convulsive shock therapy produces a de- 
crease in rigidity scores as defined; this de- 
crease approaches statistical significance for the 
total group, and meets criteria of significance 
for those shock cases rated as improved. 

4. The differences cannot be imputed to 
differences between experimental and control 
subjects in degree of original rigidity, or in 
degree of disturbance prior to treatment. 

5. An interpretation is offered in terms of 
motivation to maintain equilibrium by respond- 
ing alertly to reality, or by withdrawing from 
it. The amnesia induced by shock may break 
up habit systems based on the withdrawal 
orientation, and thus eliminate habit interfer- 
ence with responses based on the earlier reality 
orientation. 


Received September 8, 1952. 


References

1. Billig, O., & Sullivan, D. J. Therapeutic value
of protracted insulin shock. Psychiat. Quart.,
1942, 16, 549-564.

2. Cattell, R. B. The riddle of perseveration. I.
“Creative effort” and disposition rigidity. II.
Solution in terms of personality structure. J.
Pers., 1946, 14, 229-267.

3. Cattell, R. B., & Tiner, L. G. The varieties
of structural rigidity. J. Pers., 1949, 17, 321-341.

4. Fisher, S. An overview of trends in research
dealing with personality rigidity. J. Pers.,
1949, 17, 342-351.

5. Leeper, R. W. Study of a neglected portion
of the field of learning: the development of
sensory organization. J. genet. Psychol., 1935,
46, 41-75.

6. Luchins, A. S. Rigidity and ethnocentrism: a
critique. J. Pers., 1949, 17, 449-466.

7. Pullen, M. S. Rigidity in adult male psychotics
and its modification by convulsive shock
therapy. Unpublished doctor’s thesis, Univer.
of Illinois, 1952.

8. Rokeach, M. Generalized mental rigidity as
a factor in ethnocentrism. J. abnorm. soc. Psychol.,
1948, 43, 259-278.

9. Stagner, R. Homeostasis as a unifying concept
in personality theory. Psychol. Rev., 1951, 58,
5-17.

10. Thurstone, L. L. Multiple-factor analysis.
Chicago: Univer. of Chicago Press, 1947.


Journal of Consulting Psychology
Vol. 17, No. 2, 1953


A Specific Relapse Phenomenon During the
Course of Electric Convulsive Therapy¹


Roderick W. Pugh 


Veterans Administration Hospital, Hines, Illinois 


This article is the report of one group of 
findings resulting from an exploratory, de- 
scriptive investigation of certain psychological 
processes during courses of concurrent electric 
convulsive therapy and nondirective psycho- 
therapy with paranoid schizophrenia [1]. 
These findings pertain to a specifically local- 
ized relapse in charted improvement, followed 
by recovery, which occurred during the courses 
of treatment. 


Problem and Procedure 


The problem was to chart the course of 
certain psychological processes, and to deter- 
mine the significance of any measurable 
changes in them throughout the treatment 
periods. The psychological processes involved 
were defined by eighteen categories of verbal 
expressions. Fourteen of these consisted of 
expression of attitudes or feelings and were 
designated “affective” categories. Four cate- 
gories were positive, negative, objective, and 
ambivalent attitudes towards the self; four 
were positive, negative, objective, and am- 
bivalent attitudes towards others; two were 
positive and negative attitudes perceived as 
directed towards the self from others; and 
four consisted of the summation of positive, 
negative, objective, and ambivalent attitudes 
towards the self and others. The four remain- 
ing categories of the eighteen were designated 
“nonaffective.” They were: “Statement of 
problem or symptom,” “Understanding or insight,”
“Discussion of plans,” and “Irrational
statements or verbalization.”

¹From VA Hospital, Hines, Illinois. Reviewed in
the Veterans Administration and published with the
approval of the Chief Medical Director. The
statements and conclusions published by the author
are the result of his own study and do not
necessarily reflect the opinion or policy of the
Veterans Administration.

The investigative procedure called for a 
patient’s receiving sixteen electric convulsive 
treatments at the rate of three per week, and 
ten nondirective interviews approximately one 
hour in length. The convulsive treatments and 
interviews ran concurrently. The sequence 
began with an interview, and thereafter an 
interview was held twenty-four hours or later 
after every second treatment until the sixteen 
convulsive treatments and nine interviews 
were completed. At this point two weeks were 
allowed for recovery from possible “organic” 
effects of the convulsive therapy. Then the 
tenth and final interview was held. The 
median number of days required for the entire 
procedure was 56.5. 
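The schedule just described can be sketched in a few lines. This is only an illustration of the stated rule (one opening interview, one interview after every second treatment, one final interview after the recovery interval); the function name and its parameters are hypothetical and do not come from the original study.

```python
def interview_schedule(n_treatments=16, treatments_per_interview=2):
    """Return (interview_number, treatments_completed) pairs.

    Assumption: the first interview precedes all treatments and the
    last interview follows the two-week recovery interval.
    """
    schedule = [(1, 0)]  # opening interview, before any treatment
    interview = 1
    for done in range(treatments_per_interview,
                      n_treatments + 1,
                      treatments_per_interview):
        interview += 1
        schedule.append((interview, done))
    # final interview, held after the recovery interval
    schedule.append((interview + 1, n_treatments))
    return schedule


schedule = interview_schedule()
for number, done in schedule:
    print(f"interview {number:2d}: treatments completed = {done}")
```

Under these assumptions the sixth and seventh interviews follow the tenth and twelfth treatments respectively.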

The interviews were electrically recorded 
and later transcribed into verbatim typescript. 
They were then analyzed on the basis of the 
eighteen categories of verbal expressions listed 
above so that trends followed by the psycho- 
logical processes represented could be deter- 
mined. This method of interview analysis 
yielded reliability coefficients between analysts 
of .88, significant at the .001 level of
confidence.² The significance of variations in trends
was derived by the statistical method of deter- 
mining the significance of the difference be- 
tween proportions as applied to successive pro- 
portions of interviews represented by a partic- 
ular category. This same procedure was ap- 
plied to groups of interviews representing the 
initial (interviews 1 to 3), middle (interviews
4 to 7), and final (interviews 8 to 10)
stages of the treatment procedure.

²Additional measures of over-all change were
obtained by comparing before and after test data
from the Rorschach, MMPI, and Stein Sentence
Completion Test. These results do not feature in
the findings under consideration here, however.
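A test of the difference between two proportions can be sketched as follows. The article does not give its formula, so the pooled-variance normal approximation used here is an assumption, and the response counts are hypothetical figures chosen only for illustration.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z test for the difference between two independent proportions.

    x1, x2: responses falling in the category in each interview
    n1, n2: total responses in each interview
    Assumption: pooled-variance normal approximation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided probability from the standard normal distribution
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical counts: 30 of 200 responses in one interview,
# 50 of 200 responses in the next.
z, p_value = two_proportion_z(30, 200, 50, 200)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```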


Subjects 


The subjects were six World War II 
veterans drawn three each from two Veterans 
Administration hospitals in the Chicago area. 
They were selected by the criteria of a diag- 
nosis of paranoid schizophrenia by two psy- 
chiatrists, an achievement of a T score of 59 
or above on both the Paranoia and Schizo- 
phrenia scales of the Minnesota Multiphasic 
Personality Inventory, and the selection of 
electric convulsive therapy as a somatic treat- 
ment. They ranged in age from 24 to 35 years. 
The duration of their illness could not be 
precisely determined, but since all had served 
actively in the last world war, it is probable 
that none had been acutely affected for longer 
than five years prior to study. All were re- 
ceiving their first course of electric convulsive 
therapy and none had previous exposure to 
psychotherapy. 




The Relapse Phenomenon 


Trends followed by the psychological pro- 
cesses in each patient over the entire treat- 
ment course predominately represented thera- 
peutic gain. Seventy to 100 per cent of the 
statistically significant³ changes recorded for
each patient were so characterized. For the 
six patients as a group, 61 per cent of the 
categories (11 of 18) demonstrated signifi- 
cant and therapeutically directed change. The 
remainder showed no statistically significant 
changes. Greatest therapeutic gains were rep- 
resented by decreased statements of problems 
or symptoms, and decrement in negative feel- 
ings and attitudes, especially of total negative 
feelings expressed and negative feelings to- 
wards the self and towards others. 

³Criterion for significance is at the .05 level or
less.

During the course of treatment, however,
reversals in trends indicative of negative
therapeutic movement, and constituting a relapse
in a general course of improvement, were observed
to occur specifically between the tenth
and twelfth electric convulsive treatments and
during the period covered by the sixth and
seventh therapeutic interviews. The observed
reactions were most characteristic of the
affective categories, with negative feelings
increased and positive feelings decreased. The
strongest and most consistent reversals
occurred in “Negative references to others” and
in total negative feelings. Of almost equal
strength and consistency were reversals in
“Positive references to others” and in total
positive feelings (Figs. 1, 2, and 3).
Occasionally the reaction was demonstrated by
decreased ability to make objective appraisal
and by increased perception of negative
attitudes directed towards the self from others.
Some of the nonaffective categories in which
the relapse phenomenon appeared in instances
were “Statements of problems or symptoms”
and “Irrational statements or verbalization.”
Interestingly, personal attitudes and feelings
toward the self exhibited the greatest relative
stability during the period. It should be
pointed out here that the relapse phenomena
were made prominent in graphical records
because of early and almost immediate
recovery of the previous therapeutic course of
movement.

[Fig. 1. Average per cent of responses in
categories of positive and negative references to
others for each interview for the six subjects as
a group. Ordinate: per cent of total responses in
the interview; abscissa: interviews 1 to 10.]

[Fig. 2. Per cent of responses in categories of
total positive feelings and total negative
feelings expressed in each interview by Subject 4.
Ordinate: per cent of total responses in the
interview; abscissa: interviews 1 to 10; the
relapse is marked on the curves.]

[Fig. 3. Per cent of responses in categories of
“perceived feelings of others toward the self”
(positive and negative) expressed in each
interview by Subject 1. Ordinate: per cent of
total responses in the interview; abscissa:
interviews; the relapse is marked on the curves.]


The relapse was marked in two of the six 
patients by clinically prominent changes in 
behavior such as increase in delusional mate- 
rial, acting-out of hostility and aggression to- 
wards others, violent rejection of the somatic 
therapy, and in one case, increased agitation 
to the point of requiring sedation. It was 
such a change in behavior on the part of one 
of these patients in particular which caused the 
investigator to examine more closely the thera- 
peutic sessions of the others for the same 
period of the treatment course. In the case of 
four of the six patients, the relapse reaction 
was made evident only by the precise analysis 
of the recorded interviews. More specifically,
the relapse in these four was not observed at
the clinical level. 


Discussion 


The fact that psychiatric treatment is 
characterized by exacerbations, whether the 
treatment is somatic or psychotherapeutic, is
not new knowledge. The curves of all 18
categories for each of these patients were 
irregular at other periods as well as at the 
period in question. The importance of the 
present observations lies in the fact that a 
qualitatively similar and a particularly prom- 
inent relapse, as revealed by the methods of 
this investigation, fell at a specific locus in 
treatment time for each of these six patients, 
in spite of the fact that the patients were not 
in the same phase of treatment at the same 
time and that they were treated at two sepa- 
rate hospitals. Similar data are not known to 
exist on any other group, nor is the investi- 
gator aware of any reported clinical observa- 
tions which have so narrowly localized such 
a phenomenon. Each of the subjects demon- 
strated the relapse in from 5 to 11 of the 18 
categories. Throughout the group there were 
47 reversals in trends recorded during this 
time. Of these, 59.57 per cent were statistical- 
ly significant. 

Although this study was not controlled to 
partial out the individual effects of the
convulsive therapy and the psychotherapy,⁴ there
seems to be little reason to doubt that the so- 
matic treatment was largely responsible for 
the reactions. Since the patients were studied 
at different times and were divided between 
two different hospital environments, it is most 
unlikely, excluding the factor of the electric 
convulsive treatment, that even the same psy- 
chotherapist or that psychotherapy itself would 
elicit such regularly occurring reactions from 
six different personalities. However, it is with- 
in the realm of tenable speculation that con- 
trolled electrical stimulation to organically 
similar brain tissue could produce a specific 
physiological response from the tissue predic- 
table with respect to time of occurrence. 

⁴Psychotherapeutic sessions in this study were
primarily intended as an avenue for investigating
psychological processes rather than therapy as such.
Thus, to minimize the direction of thinking and
behavior, a nondirective type interview was utilized.

The exact type of physiological reaction
which might be involved is unknown. In
simplest terms, however, it is possible that,
while electric convulsive therapy has
ameliorative effects, because of its drastic nature
the attendant stress reaches a critical cumulative
intensity at a certain point which is sufficiently
disruptive of cerebral physiology and
metabolism that prior therapeutic gains are
momentarily nullified by a relapse attendant to the
onset of the disruption. The organism reacts,
however, by mustering all its physiological
defenses to counteract this state of affairs, just
as the body reacts to all states of shock. This
rallying of defenses would then coincide with
the recovery from the relapse. It is
hypothesized that the duration of the recovery is
dependent upon two factors: (a) the duration
and intensity of continued stress stimulation,
and (b) the inherent strength of the organism
to withstand further such excesses.


The findings for these six subjects are clear. 
The nature of the study, however, precludes 
broader generalizations at present. Should 
further research substantiate the existence 
of a specifically localized relapse in courses of 
electric convulsive therapy of ten or more 
treatments with paranoid schizophrenia or 
other diagnoses, definite implications with
respect to the efficacy of longer courses of
treatment would follow. Present findings
suggest that future research might be directed
towards investigating these additional
hypotheses:


1. That the basic reaction of the relapse 
phenomenon is physiological. 


2. That the stronger and more pervasive 
the relapse, the less favorable the psychological 
prognosis. 

3. That the strength of the relapse is a posi- 
tive correlate of the relative severity of the 
basic psychological pathology. 

4. That more effective results with electric 
convulsive therapy are achieved if, at the rate 
of three treatments per week, there is suffi- 
cient interruption in time after every ninth 
administration to allow for physiological re- 
covery from the stress of the therapy. 


Received August 11, 1952. 


Reference 


1. Pugh, R. W. An investigation of some psy- 
chological processes accompanying concurrent 
electric convulsive therapy and nondirective 
psychotherapy with paranoid schizophrenia. Un- 
published doctor’s dissertation, Univer. of Chi- 
cago, 1949. 








Journal of Consulting Psychology
Vol. 17, No. 2, 1953


Psychological Changes Following Operations on 
the Human Frontal Lobe¹


Sidney Crown 
Institute of Psychiatry (Maudsley Hospital), London 


In a previous paper, Crown [2] reviewed 
studies of psychological changes following pre- 
frontal leucotomy, the majority of which had 
appeared before 1949. The clinical follow-up 
studies discussed in that review almost invari- 
ably reported personality changes to follow the 
operation. These changes, however, were usu- 
ally expressed in terms which are difficult or 
impossible to define and measure such as ap- 
athy, lack of spontaneity, fatuous equanimity, 
absence of finer emotional response, etc. 


In order to clarify the field, it was essen- 
tial that an attempt should be made to express 
the changes taking place after operation in 
terms of carefully defined, quantitative hy- 
potheses which could be clearly confirmed or 
rejected. In the present state of knowledge, 
such hypotheses would inevitably ignore the 
complexity and subtlety of the changes which 
take place within any individual patient, for 
it would be necessary to concentrate, at first, 
on broad changes detectable over a group of 
persons. As in all scientific work, however, 
this initial oversimplification is justifiable if it 
leads at a later stage to more detailed experi- 
mental work. 


Eysenck [5] has attempted experimentally 
to establish some of the dimensions or direc- 
tions along which human personalities vary, 
and to define these dimensions operationally 
in terms of psychological tests. Further, in the 
light of reported clinical findings his concep- 
tual scheme seemed relevant to the study of 
leucotomy. Studies in which an attempt was 
made to apply Eysenck’s theory of personality 
organization to the neurosurgical field were,
therefore, carried out by Petrie, whose reports
were summarized recently by her [13], and by
the present author [4].

¹I am very grateful to Mrs. Sybil Eysenck who
carried out a considerable part of the testing for
this research and to Dr. H. J. Eysenck for reading
and criticizing the manuscript.

Petrie has worked primarily with neurotic 
patients. She demonstrated [11] that in neu- 
rotics the personality changes three months 
after operation take place along three broad 
dimensions: in the intellectual sphere, there 
was a decrease in intelligence as shown par- 
ticularly on the Wechsler verbal scale and the 
Porteus Maze test. Of the nonintellectual per- 
sonality changes, there was a decrease in the 
conative or characterological factor that
Eysenck [5] has called “neuroticism” or
emotional stability; this was shown by a significant
decrease in body-sway suggestibility and sup- 
ported by changes on certain other objective 
tests and indices. The third main change was a 
temperamental one, from the introvert to the 
extravert end of the introvert-extravert con- 
tinuum. This temperamental factor is one 
which Eysenck [5] defines in terms of tests 
which differentiate the dysthymic group of
patients (anxious, depressed, obsessional) from
the hysterics. In Petrie’s study the shift on this 
factor after leucotomy was shown by changes 
such as loss of persistence in a situation involv- 
ing an uncomfortable physical position, a tend- 
ency on several tests to go for speed rather 
than accuracy, a lowering of the level of as- 
piration, more realistic judgment of perform- 
ance, and reduced motor perseveration. She 
confirmed these findings in a retest nine months 
after operation [10], and in a later paper [12] 
showed that after less severe operations the 
nonintellectual changes were in the same di- 
rection but were less pronounced, and changes 
in intelligence were not detectable at all. 

Crown [4] investigated the same hypotheses
in a group of patients the majority of whom
were psychotic. There were few statistically
significant changes in this group following the 
operation, although the direction of the 
changes was consistent with decreases in in- 
telligence and neuroticism. It was not possible 
to demonstrate a consistent change towards 
extraversion in the psychotic patients. 


That the hypotheses were clearly confirmed 
on neurotic and not psychotic patients is un- 
derstandable in the light of later research on 
the “psychoticism” factor. In a recent study, 
Eysenck [6] confirmed the hypothesis that the 
functional psychoses (schizophrenia and manic 
depressive insanity) are not qualitatively dif- 
ferent from normal mental states but form one 
extreme of a continuum which goes all the 
way from the perfectly normal, rational, to 
the completely insane, psychotic individual. 
This personality dimension, “psychoticism,” is
relatively uncorrelated with neuroticism or 
with introversion-extraversion. Psychotic pa- 
tients, such as those tested by Crown [4], are 
not persons who are especially high in their 
position either on the neuroticism or the in- 
troversion dimensions and would not be ex- 
pected to show changes along these dimensions 
comparable in size to those made by neurotics. 
Further, as has been pointed out [7], cognitive 
processes have not yet been linked up prop- 
erly with psychoticism so that it is difficult to 
predict changes in intelligence for psychotic 
subjects after leucotomy on the basis of Ey- 
senck’s dimensional system. In the psychotic 
field, as Eysenck [7] further suggests, the 
main prediction would be that after leucotomy 
there would be a shift along the psychoticism 
axis in the direction of greater normality. A 
test of this hypothesis is the main interest of 
the present investigation. 

More tests of the “psychoticism” factor 
than of the other factors are included in the 
present battery. On the factor of introversion- 
extraversion, on which little work has as yet 
been completed following up that reported by 
Eysenck [5],² only a “marker” test
(persistence) is included to demonstrate the shift on
this factor. On the whole, tests have been
included in the present battery only if enough
knowledge has been accumulated about them
to place them fairly accurately as measures of
a given factor. Experimental tests and tests
whose position is, as yet, ambiguous because
of equal saturations with more than one
factor (e.g., Abstractions, cf. Eysenck [7]) have
not been used.

²For the reader interested in the nature of
introversion-extraversion, it may be stated that several
follow-up researches are at present being carried out
in this department.

Unfortunately, the present group contains 
both psychotic and neurotic patients. Also 
both full and limited operations have been 
performed. From the considerations cited ear- 
lier in this section then, it may be suggested 
that although the decreases in intelligence, neu- 
roticism, introversion, and psychoticism should 
take place, these changes may be somewhat at- 
tenuated. If, however, the personalities of per- 
sons after prefrontal leucotomies or ablations 
can be shown to change measurably along pre- 
cisely defined psychological dimensions, a use- 
ful preliminary aim will have been accom- 
plished. The time will then be appropriate to 
attempt specific experimental studies in the 
neurosurgical field. 


Organization of the Research 


When the research was begun, it was antici- 
pated that the patients to be operated upon 
would fall into three broad clinical groups: a 
group operated upon for the relief of obses- 
sional illness, tension, agitation and depression, 
a schizophrenic group, and a group of patients 
suffering from intractable pain. Three types 
of cerebral operation were to be carried out: 
open prefrontal leucotomy [15], topectomy 
[14] and cortical undercutting [18]. As there 
is no adequate comparative evidence on the 
merits of these different types of operation, when,
after careful consideration, it was judged
that a patient might benefit from a cerebral
operation, one of the three operations under
investigation was assigned to him on a random
basis. The system of randomization was
such that each operation would be assigned an 
equal number of times within each broad clin- 
ical group. Patients considered by hospital 
physicians as possibly suitable for neurosurgi- 
cal treatment were presented at a clinical con- 
ference and the decision for or against opera- 
tion made by two senior psychiatrists and by 
the surgeon. In almost all cases this decision 
was made by the same set of persons. 
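One way to obtain a random assignment in which each operation occurs equally often within a clinical group is to draw from shuffled blocks. The article does not say how its randomization was carried out, so the block method, the group sizes, and the function name below are all assumptions made for illustration.

```python
import random

OPERATIONS = [
    "open prefrontal leucotomy",
    "topectomy",
    "cortical undercutting",
]


def balanced_assignments(n_patients, rng):
    """Assign operations in shuffled blocks of three.

    Each block contains every operation once, so counts stay equal
    whenever n_patients is a multiple of three.
    """
    assignments = []
    while len(assignments) < n_patients:
        block = OPERATIONS[:]
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_patients]


rng = random.Random(1953)  # fixed seed so the sketch is repeatable
# Hypothetical group sizes; the article reports no planned numbers.
group_sizes = {
    "obsessional, tension, depression": 6,
    "schizophrenic": 6,
    "intractable pain": 6,
}
plan = {name: balanced_assignments(n, rng)
        for name, n in group_sizes.items()}

for name, ops in plan.items():
    counts = {op: ops.count(op) for op in OPERATIONS}
    print(name, counts)
```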










At the present time (18 months after the 
start of the project) 15 patients have been 
operated upon and 13 of these have been fol- 
lowed up for at least 3 months. From the psy- 
chologist’s point of view, of these 13 patients, 
4 were untestable: 3 were too ill to cooperate 
and 1 took an early postoperative discharge. 
Of the remaining 9 patients, 2 could not be 
tested on a large number of the present tests, 
as these patients had been tested earlier in their 
stay in hospital on tests which were so similar 
as to invalidate testing on the present battery. 


This report, therefore, is concerned with 7 
patients who were tested before, and approxi- 
mately 3 months after, operation. Six of the 
patients were women. The age range of the
whole group was from 29 years to 53 years 
with a mean age of 40 years. The IQ range 
of the 6 patients who successfully completed 
the Progressive Matrices test (Raven) was 
from under 89 to 126+ with a mean of 108 
points (approximately).³ Three of the group
were suffering from severe obsessive-compul- 
sive illnesses and four from severe depressive 
illnesses with marked tension. Three and pos- 
sibly four of these patients were suffering from 
illnesses of a degree of severity which must, 
clinically and socially, be considered as
“psychotic.” Three, the obsessive-compulsive
illnesses, following normal clinical practice, must
be considered as “neurotic.” 


Three of the seven patients had an open 
prefrontal leucotomy, three a topectomy, and 
one a cortical undercutting. This unequal dis- 
tribution of the various types of operation de- 
spite the over-all system of assigning operations 
on a statistically random basis is the result of
selecting the seven patients in the present 
group, for reasons specified earlier, from the 
total number of patients operated upon to date. 
In the present group this meant that four out 
of the seven patients had an operation of lim- 
ited severity. On the evidence of Petrie [12] 
the effect of the limited operation is to modify 
the severity of the postoperative personality 
changes. These should still, however, be in the 
hypothesized directions. 


³This figure is approximate because the most and
the least intelligent members of the group were out- 
side the test norms. 


Method of Analysis 


Lord’s [9] modified t test⁴ was used to test
the significance of the differences between the
scores on each test before and after operation.
Taking each test separately the score of each
person is compared before and after operation
and a difference (D) column is calculated. The
hypothesis to be tested is that the mean of the
D column is significantly different from zero.
In Lord’s modified t test the test of
significance is based on the use of the sample range
rather than the sample standard deviation. This
significance test is slightly less efficient than
the more usual t test. It is, however,
considerably easier to compute and, where the number
of tests in an investigation is large and the
number of persons tested small, is a useful
significance test.
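The computation can be sketched as follows, on the assumption that the range-based statistic is the absolute mean difference divided by the range of the differences, and that the standard deviation is estimated by dividing the range by the expected range of a standard normal sample of the same size. The scores are hypothetical, the constants are the usual tabled values for that expected range, and critical values would still have to be taken from Lord's published tables.

```python
# Expected range of n standard normal observations (tabled constants).
EXPECTED_RANGE = {
    2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534,
    7: 2.704, 8: 2.847, 9: 2.970, 10: 3.078,
}


def range_based_test(pre, post):
    """Return mean difference, range, range-based statistic, and s.

    Assumptions: statistic = |mean D| / range of D;
    s = range of D / expected range for a normal sample of size n.
    """
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    spread = max(diffs) - min(diffs)
    if spread == 0:
        raise ValueError("all differences are equal; range is zero")
    statistic = abs(mean_d) / spread
    s_estimate = spread / EXPECTED_RANGE[n]
    return mean_d, spread, statistic, s_estimate


# Hypothetical pre- and postoperative scores for five persons.
pre_scores = [16, 18, 15, 17, 15]
post_scores = [15, 16, 14, 15, 14]
mean_d, spread, statistic, s_estimate = range_based_test(
    pre_scores, post_scores)
print(mean_d, spread, statistic, round(s_estimate, 3))
```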


Table 1

Pre- and Postoperative Test Results: Intelligence,
Neuroticism and Introversion-Extraversion

Test                               N    M Pre    M Post   s            p

Intelligence
1. Porteus Mazes                   6     93.3      90.5    30.0        NS
2. Shipley-Hartford Abstractions   7      8.6       9.3     3.0        NS
3. Mill Hill Vocabulary            5     18.6      18.0     5.6        NS
4. Opposites:
     Selection                     5     16.6      18.0   [illegible]  NS
     Creation                      5     16.2      14.8     0.9        0.01

Neuroticism
5. Word Connexion List             7     13.3      12.3     7.0        NS
6. Maudsley Medical Questionnaire  7     19.9      14.4     8.1        NS
7. Perseveration                   4     1.77      1.54     0.7        NS
8. Body Sway:
     Static ataxia                 6     38.8      51.0    53.7        NS
     Suggestibility                5    +63.2    +193.0   516.3        NS

Introversion-Extraversion
9. Leg persistence                 7    148.0      92.3   158.3        NS





⁴I am indebted to Mr. A. S. C. Ehrenberg, of this
Institute, for bringing the method to my attention 
and for his advice on other statistical aspects of 
this paper. 




Results 


The test results are given in Tables 1 and
2. In each of these tables the following legend
applies: N = the number of pairs of scores;⁵
M Pre = preoperative mean score; M Post =
postoperative mean score; s = an estimate of
the standard deviation of the D column (both
this estimate and the tests of significance are
based on the sample range instead of the
sample standard deviation, cf. [9]); p = level
of significance (as a definite hypothesis was
investigated in every case, the one-tailed test
was used, cf. Jones [8]).

In the paragraphs below are given brief de- 
scriptions of the tests and methods of scoring. 
The numbers of the paragraphs correspond to 
those in the first column of the tables. Fur- 
ther details of these or similar procedures may 
be obtained from the references quoted or di- 
rectly from the present writer. 


Annotation to Table 1. 


1. Porteus Mazes [16]. Procedure as detailed by 
Porteus. Score: Test Quotient. 


2. Shipley-Hartford Abstractions [19]. A short 
reasoning test, the items of which are steeply grad- 
ed in difficulty. Score: number of items correct. 


3. Mill Hill Vocabulary [17]. A vocabulary test
of the synonyms type. Alternate forms given before 
and after operation. Score: number of items correct. 

4. Opposites [1]. The two alternate forms of 
Cattell’s opposites tests were used, the first form to 
be administered in the usual way and then the alter- 
nate form to be administered without giving the S 
the usual choice of answers but making him supply 
his own answers without help. The first test is
intended as a “selection” test, the second as a
“creation” test.⁶ Both tests were administered without
time limit. Scores: number of items correct in each
case.


5. Word Connexion List [3]. A controlled asso- 
ciation test in which each item consists of a stimu- 
lus word and two possible response words, one of 
which is an “abnormal” response, the other a
“normal” response. Score: number of “abnormal” choices
underlined.


6. Maudsley Medical Questionnaire [5]. This
test consists of questions dealing largely with
physical complaints such as are frequently made by
neurotic subjects; a few psychological questions are
also included. Score: number of symptoms shown.

7. Perseveration [6]. The S is required to write
a series of Ss for 15 seconds (Period A), then a
series of Ss reversed for 15 seconds (Period B)
followed by a 30-second period (Period C) in which
he writes alternately one S followed by an S
reversed. Score: (A + B)/C.

8. Body Sway [5]. From a string attached to
the collar a continuous record is made on a
kymograph of the S’s forward and backward sway
during (a) a 30-second period when he is told simply
to stand quite still with his eyes closed (static
ataxia); and (b) a 2½-minute period when he is
given continuous suggestion by gramophone record
that he is falling forward. The kymograph records
are scored graphically using a grill of ¼-inch
squares drawn on tracing paper to estimate the
amount of movement, i.e., the area under the curves.
Scores: Static ataxia: total forward sway (number
of ¼-inch squares) + backward sway.
Suggestibility: total forward sway or total backward sway
whichever is the greater.⁷

9. Leg persistence [5]. The S, sitting, holds his
leg out horizontally over a chair of the same height
for as long as he can without touching down. Score:
time in seconds.

⁵For practical reasons it was not always possible
to test or to score all the subjects on every test.

⁶This principle is a fairly common one in the
design of high-level intelligence tests. The particular
test used in the present investigation is based on
a suggestion of Dr. H. J. Eysenck.


Table 2

Pre- and Postoperative Test Results: Psychoticism

Test                          N            M Pre        M Post       s            p

1. Fluency:
     Birds                    7              9.7          9.9          1.5        NS
     Animals                  7             10.4         13.1          1.5        <0.01
     Flowers                  7             12.6         14.4          3.3        NS
     Total                    7             32.7         37.4          4.4        0.02
2. Mirror Drawing:
     1st Trial                7            165.7         80.6         66.2        0.01
     2nd Trial                7             88.3         52.4         44.7        0.05
     3rd Trial                7             65.6         41.1         45.9        NS
3. Circles:
     Time                     7             20.3         20.4         20.0        NS
     Largest Diameter         7             40.6         42.7         11.1        NS
     Smallest Diameter        7             35.8         36.9        335 (?)      NS
     Average Diameter         7             37.9         39.3         10.4        NS
4. Squares:
     Time                     7             30.1         23.9         26.6        NS
     Largest Diagonal         7             45.3         47.3         14.8        NS
     Smallest Diagonal        7             41.1         38.4         10.4        NS
     Average Diagonal         7           [illegible]  [illegible]     9.4        NS
5. Reading Prose            [illegible]   [illegible]  [illegible]  [illegible]   NS
6. Estimating distances:
     Overestimation, 3 tests  7              2.6          1.9          6.3        NS
     Underestimation, 3 tests 7           [illegible]    13.0         15.5        NS
     Error 8" estimate        7              4.7          2.7          4.4        NS
     Error 1' estimate        7              6.4          4.0          5.9        NS
     Error 2' estimate        7             12.6          8.1          6.7        NS
     Total error              7             23.7         14.9         15.2        NS
7. Waves:
     Amplitude                4             25           22.5          2.4        NS
     Wavelength               4            116.2         90.0         15.5        0.02
8. Tapping                    7             57.6         66.2         16.6        NS
9. Reversal of perspective:
     Passive                  6              3.3       [illegible]  [illegible]   NS
     Active                   6              8.2       [illegible]  [illegible]   NS
     Inhibited                6              2.5          3.0          2.38       NS


Annotation to Table 2. 


The reference for all the tests of Table 2 is
Eysenck [6].

1. Fluency. Number of birds, animals, and
flowers the S can name, one minute for each.

2. Mirror Drawing. The S is required, looking
only in the mirror, to trace round a diamond shape
the corners of which are indicated on the paper by
dots. Three trials. Score: time (seconds).

3. Circles. The S was handed a clean piece of
paper and asked to draw three circles. Scoring as
in Table 2: time in seconds, measures in mm.

4. Squares. The paper with the circles on was
reversed and the S asked to draw three squares on
the other side. Scoring as in Table 2: time in
seconds, measures in mm.

5. Reading Prose. The S was asked to read
aloud a standard piece of prose. Score: time in
seconds.

6. Estimating Distances. The S is asked to
indicate what he considers a distance of 2 feet to be
by placing two matches two feet apart on the table;
then one foot; then 8 inches, so that there are four
matches on the table. These distances are measured
and recorded. Scoring as in Table 2: quarter-inch
units.

⁷We have not used the kymographic method of
recording body sway before and the scoring
problems raised by this method have not yet been
systematically tackled. In particular, falls raise
difficulties of scoring and the records of persons who
fell have not been included in the present
quantitative analysis. However, of the falls which occurred,
3 took place before operation and one
postoperatively. This is consistent with a decrease in
neuroticism. Secondly, the quantitative scoring units
(in this study ¼-inch squares) are arbitrary units.
It is hoped to develop a more satisfactory scoring
system when the precise amount of bodily
movement represented by a given amount of
kymographic movement has been investigated.

7. Waves. A sheet of foolscap size, squared pa- 
per with 4 V’s marked at precise places is given to 
the S. In each of 4 trials the S is required, first,
with the eyes open, to trace over a V; then, with 
eyes closed, to make six more V’s along the same 
line and, as far as possible, the same size. The arm 
is kept up all the while, only the pencil touching 
the paper. Two scores were derived: amplitude 
(average of 1st and last amplitude of 4 sets of
waves, i.e., of 8 measures altogether); and wave- 
length (average wavelength of all 4 sets measured 
parallel to edge of the paper). Measure in mm. 

8. Tapping. Tapping with a pencil on a sheet 
of paper for two trials of 15 seconds each. Score: 
Average number of taps made. Instruction is not to 
tap as fast as possible; score, therefore, represents 
natural tempo rather than maximum rate. 

9. Reversal of perspective. Necker Cube.
Illusion explained to S. Scores: (a) Passive, i.e.,
average number of fluctuations during two 30-second
periods, the one period at the beginning and the
other at the end of the test. (b) Active, i.e.,
number of fluctuations recorded for 30 seconds when S
is asked to make illusion change as fast as possible.
(c) Inhibited, i.e., number of changes recorded
during 30 seconds when S is asked to try and
prevent the cube from changing.


Discussion 


As the hypotheses investigated are precise, 
in the following discussion importance is at- 
tached both to changes which are statistically 
significant at or less than the 10 per cent level 
and to groups of changes which, although in- 
dividually insignificant, are in the hypothesized 
direction. 

Considering first changes in intelligence, 
5 scores were derived from the intelligence 
tests. Of these, the changes in 4 are insignifi- 
cant and their direction is inconsistent: per- 
formance is improved slightly postoperatively 
in the Shipley-Hartford Abstractions test and 
in the selection of opposites and is diminished 
on the Porteus Mazes and the selection of
synonyms (Mill Hill Vocabulary).

There is, however, one particularly inter- 
esting change on the “Creation” test, one of 
the few experimental tests in the battery in- 
cluded in an attempt to measure changes in 
some of the higher intellectual functions. 
After a cerebral operation the group were af- 
fected to a significant degree in their ability 
to think out without help a word opposite in 
meaning to a stimulus word, although on the
“Selection” test, using words of a parallel
degree of difficulty, they show themselves to be
unaffected or even improved slightly in the 
power to select the correct opposite from a 
number of choices. Not enough persons were 
examined to make a serious quantitative analy- 
sis of errors worth while. However, consider- 
ing the records of the 3 persons in the group 
whose scores on the “Creation” test dropped 
most postoperatively, the errors appeared to be 
of two types. In the first type an opposite of 
inferior stringency was given after operation 
compared with that given before, e.g., patient
A.M. “Brave” is the opposite of “cowardly”
(preoperatively), “stupid” (postoperatively);
patient A.W.S. “all” is the opposite of “none”
(preoperatively), “a few” (postoperatively).
The second type of inferior response occurred
when the S made no real effort to find an
opposite, e.g., patient H.E.L. “Brave” is the
opposite of “coward” (preoperatively), “not
brave” (postoperatively); or “save” is the
opposite of “spend” (preoperatively), “not save”
(postoperatively).

Thus although the results of the usual in- 
telligence tests—performance (Porteus), rea- 
soning (Shipley) and verbal (Mill Hill Vo- 
cabulary, selection of opposites)—showed no 
impairment, the results from the “Creation” 
test do suggest that high-level intellectual 
changes may take place after neurosurgical 
operations. 

Petrie [12], using the Wechsler-Bellevue 
vocabulary and other verbal tests, has also re- 
ported that her patients show carelessness in 
the use of language after leucotomy and she 
suggests that these intellectual changes may be 
more characteristic of the posterior operations. 
In the present small series of 5 cases tested 
on this test, of three patients who showed the 
greatest postoperative deficit on the creation of 
opposites test, two had the open prefrontal 
leucotomy, the most severe of the operations 
carried out in our series. 

It is difficult to determine how far such 
changes are a “true” intellectual deficit and 
how far they are secondary to changes of a 
nonintellectual type such as in the will or mo- 
tivation to work or carelessness at work. This 
is the difficulty which Petrie [13] also came 
up against in attempting to explain her
findings. She provides no convincing evidence for
deciding whether the change is a “true”
intellectual change or not, and in the present
experiment it is equally impossible to do this. 

On the tests of neuroticism, none of the 
changes is significant but the majority are in 
the hypothesized direction. The group score 
less abnormally on the Word Connexion List, 
on the Maudsley Medical Questionnaire, and 
lower on the test of Perseveration. Only on 
the Body Sway test are the scores in the 
wrong direction for the hypothesis. There is, 
however, considerable variability in individual 
performance on both the tests of Static Ataxia 
and Suggestibility after operation as the sta- 
tistics quoted in the previous section show. 
Larger numbers are needed to determine with 
any confidence whether the changes in score 
on this test are truly inconsistent or whether 
a more definite change in one direction or an- 
other may be discerned. Further, as stated 
earlier, falls were not taken into account in 
the present scoring although the greater pre- 
operative occurrence of this phenomenon is as 
expected. On the whole, then, the results show 
a discriminable but attenuated shift in the 
scores of the group towards the normal end 
of the normality-neuroticism continuum. On 
a mixed group of neurotic and psychotic pa- 
tients this result is what, as stated earlier, 
would be expected. 

A single “marker” test, Leg Persistence, 
was included to check the tendency of the 
group to move towards the extraverted end of 
the introversion-extraversion personality di- 
mension. As with the tests of neuroticism, only 
a slight decrease in score on this test would be 
hypothesized in a mixed neurotic and psychotic 
group. Thus the observed, but nonsignificant, 
decrease in leg persistence is as expected. 

Of the 28 scores derived from the tests of 
the psychoticism factor, 21 are in the direction 
which would be hypothesized if the scores of 
the group are becoming more typical of the 
normal than the psychotic person, and of these 
5 are statistically significant. The simple bi- 
nomial test shows that the occurrence of 21 
out of 28 changes in the hypothesized direc- 
tion is significant (i.e., would not be accounted 
for by chance) assuming that each of the 
scores is independent. This is not so, however, 
as several scores are often derived from the 
same test. Accordingly, this significance test
must be interpreted with caution. If it had
been insignificant, the data would hardly have 
been worth considering further. Being signifi- 
cant, but under these limiting conditions, the 
results may be regarded as at least suggestive. 
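The binomial argument above can be checked with a short calculation. The following is a minimal sketch only: it assumes a one-sided test with a chance probability of .5 for each score, and, like the test in the text, it treats the 28 scores as independent, which the author notes they are not. The function name is illustrative.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 21 of the 28 psychoticism scores changed in the hypothesized direction.
p_one_sided = binomial_upper_tail(21, 28)
print(f"one-sided p = {p_one_sided:.4f}")
```

Under these assumptions the probability falls below .01, which agrees with the statement that the result would not be accounted for by chance; the caution about non-independent scores still applies.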

If Eysenck’s [6] paper, in which he has 
demonstrated some of the personality traits 
which distinguish the normal from the psy- 
chotic person at a high statistical level of con- 
fidence, is taken as the basis for our hypotheses, 
then the shift towards normality on the psy- 
choticism factor is shown clearly on all the 
tests included in the present battery except the 
Drawing Circles test. The shift is shown es- 
pecially well on the Fluency tests, the Mirror 
Drawing test, and the Waves test. Thus, 
after leucotomy the group becomes more flu- 
ent, performs better on the Mirror Drawing 
test, increases the rate of oscillation on the 
Reversal of Perspective test, makes smaller 
movements on the Squares and Waves tests, 
shows a lesser tendency to overestimate dis- 
tances and reads and taps more quickly. De- 
spite the small numbers and the fact that the 
group includes both neurotic and psychotic pa- 
tients, these changes support the hypothesis of 
a decrease on the psychoticism factor fairly 
convincingly. 

On the whole, therefore, the data confirm 
the hypotheses. With a mixed neurotic and 
psychotic group, where operations both of lim- 
ited and greater severity have been carried 
out, attenuated but measurable psychological 
changes take place postoperatively. At a high 
level, intellectual deficits may occur. In addi- 
tion there are changes on nonintellectual per- 
sonality factors such as introversion-extraver- 
sion, neuroticism, and psychoticism which are 
measurable at least on a group basis. A de- 
tailed exploration of the factors underlying 
these broad personality changes should now be 
profitable. 


Summary and Conclusions 


This study is one of several in which an 
attempt has been made to apply Eysenck’s the- 
ory of personality organization to the neuro- 
surgical field. 

Objective tests were used to measure some 
of the personality changes taking place after
operations on the frontal lobe.

In terms of tests used to define these fac- 
tors, it was hypothesized that, postoperatively, 
patients would tend to show a decrease in in- 
telligence, a temperamental change from intro- 
version towards extraversion and charactero- 
logical changes towards normality on the neu- 
roticism and psychoticism dimensions. 

Despite the small numbers studied, the
inclusion of both neurotic and psychotic patients,
and operations of different degrees of severity, 
the data tend to confirm these hypotheses. 

So far as this is possible, attempts should 
now be made to study the mechanisms under- 
lying these observed personality changes. 


Received August 11, 1952. 


References 


1. Cattell, R. B. Cattell Intelligence Tests. Scale 
II: Forms A & B. London: Harrap, 1935. 


2. Crown, S. Psychological changes following
prefrontal leucotomy; a review. J. ment. Sci.,
1951, 97, 49-83.

3. Crown, S. The Word Connexion List as a di- 
agnostic test: norms and validation. Brit. J. 
Psychol., 1952, 48, 103-112. 

4. Crown, S. An experimental study of psycho- 
logical changes following prefrontal lobotomy. 
J. gen. Psychol., 1952, 47, 3-41. 

5. Eysenck, H. J. Dimensions of personality. Lon- 
don: Kegan Paul, 1947. 

6. Eysenck, H. J. Schizothymia-Cyclothymia as 
a dimension of personality. II. Experimental. 
J. Pers., 1952, 20, 345-384. 


7. Eysenck, H. J. The scientific study of person- 
ality. London: Routledge & Kegan Paul, 1952. 

8. Jones, L. V. Tests of hypotheses: one-sided vs. 
two-sided alternatives. Psychol. Bull., 1952, 
49, 43-46. 

9. Lord, Edith. The use of range in place of 
standard deviation in the t-test. Biometrika,
1947, 34, 41-67. 

10. Petrie, Asenath. Personality changes after pre- 
frontal leucotomy. Brit. J. med. Psychol., 1949, 
22, 200-207. 

11. Petrie, Asenath. Preliminary report of changes 
after prefrontal leucotomy. J. ment. Sci., 1949, 
95, 449-455. 

12. Petrie, Asenath. A comparison of the psycho- 
logical effects of different types of operations 
on the frontal lobes. J. ment. Sci., 1952, 98, 
326-329. 

13. Petrie, Asenath. Personality and the frontal 
lobes. London: Routledge & Kegan Paul, 1952. 


14. Pool, J. L. Topectomy. Lancet, 1949, 2,
776-781.

15. Poppen, J. L. Technic and complications of
the standard prefrontal leucotomy. In M.
Greenblatt, R. Arnot, & H. C. Solomon (Eds.),
Studies in lobotomy. New York: Grune &
Stratton, 1950.

16. Porteus, S. D. The maze test and mental
differences. Vineland, N. J.: Smith Printing
House, 1933.

17. Raven, J. C. The Mill Hill Vocabulary Test.
London: H. K. Lewis, 1944.

18. Scoville, W. B. Selective cortical undercutting
as a means of modifying and studying frontal
lobe function in man. J. Neurosurg., 1949, 6,
65-73.

19. Shipley, W. C. A self-administering scale for
measuring intellectual impairment and
deterioration. J. Psychol., 1940, 9, 371-377.








Journal of Consulting Psychology 
Vol. 17, No. 2, 1953 


Generalization of Problem-Solving Rigidity¹


Emory L. Cowen,² Morton Wiener, and Judith Hess


University of Rochester 


Problem-solving or Einstellung rigidity has 
been defined as the tendency to adhere to an 
induced behavior when it ceases to represent 
the most direct path to a goal [4]. Consider- 
able research interest has surrounded this 
phenomenon in recent years, both in terms of 
its basic parameters and alleged personality 
correlates. With respect to certain aspects of 
the problem, there appears to be considerable 
agreement amongst the findings of independent
researchers to date. Thus, for example,
starting with the classic studies of Luchins
[8] in 1942, the susceptibility of Einstellung
rigidity to a variety of specifiable field
conditions now appears to be firmly established.

Perhaps the single issue in this general area 
of research about which agreement is most 
lacking is whether or not this type of be- 
havioral rigidity bears relationship to other 
personality variables. That such a relation- 
ship may exist is implied in the work of the 
California group [1, 6], particularly that 
of Rokeach [10], as well as in the studies 
of Cowen and Thompson [4], Fisher [5], 
and Harway [7]. Luchins in his critique of 
Rokeach’s work [9] takes an opposite view- 
point stating that “rigidity is not a function 
of the personality per se, but of particular 
field conditions.” 

In the present experimental approach to
this problem, we have started with the
observation that when all pertinent variables are
controlled (e.g., intelligence, age) and
under a constant set of field conditions,
individual differences in rigidity of response
continue to be manifested. If, with the
necessary controls exercised, it can be demonstrated
that this state of affairs is not solely a
function of specific task attitudes (i.e., toward
arithmetical problems in the case of the
water-jar technique), we shall be able to lend
greater credence to the existence of some
concept of a personality-related mode of
problem-solving behavior, at least under a given
set of field conditions.

1 Portions of this research were presented in a
paper read to the Division of Personality and Social
Psychology, American Psychological Association,
Washington, D. C., September, 1952.

2 This investigation was supported in part by a
research grant to the senior author from the
National Institute of Mental Health, National
Institutes of Health, Public Health Service. The specific
study reported in the present paper is part of a
broader research program in the area of
sociopsychological and personality correlates of
psychological rigidity.


The research problem, viewed in these
terms, becomes a relatively simple and clear- 
cut one which may be described in two steps: 


1. To develop additional measures of
problem-solving rigidity which are structurally
similar to the water-jar technique, but which 
tap different areas of functioning. 


2. To test the hypothesis that there will be 
a positive relationship between rigidity be- 
haviors as measured by the several tasks. 


Procedures and Results 


Subjects in the present study were 59 col- 
lege undergraduates, 30 females and 29 males. 
Each S was seen individually by one of three
examiners for two testing sessions approx- 
imately four weeks apart. 

Three tests were administered, each de- 
signed to measure problem-solving rigidity in 
a different area of functioning (i.e., the 
Luchins water-jar technique, an original al- 
phabet maze, and an original motor maze). 
These tests were constructed so as to be 
structurally similar and capable of measuring 
the tendency to adhere to an induced method 
of problem solving when this method ceases 
to be the most direct one. In the case of the
motor maze, however, the selected problems
proved to be too simple; hence no measure
of rigidity was obtained. Primary consideration
will therefore be given to the relationship
between the water-jar and alphabet-maze
techniques.

[Fig. 1 panels: PRACTICE (solution: SEVEN FISH);
SET (solution: LET HIM SING); CONTROL-CRUCIAL
(long solution: HER RED SUIT; direct solution:
HER HAT). Letter grids not reproduced.]

Fig. 1. Sample problems from the alphabet maze series.


With respect to administration, the water- 
jar technique was always given first. Follow- 
ing this, one-half of the Ss received the alpha- 
bet-maze technique and the remainder were 
given the motor mazes. The third rigidity 
measure was given in the second testing 
session. This procedure was followed to deter- 
mine whether obtained correlations between 
rigidity measures would differ as a function 
of the temporal interval between tests. Since 
this was found not to be the case, it was con- 
sidered defensible to pool alphabet-maze scores 
from the two administrations. 

Since various adaptations of the water-jar 
technique as well as instructions for its ad- 
ministration are reported elsewhere [3, 4, 8, 
10], the actual test problems used will not be 
reproduced here. Suffice it to mention that the 
series consisted of two practice problems, de- 
signed to familiarize Ss with the various steps 
that could legitimately be utilized in the 
solution of later problems; one control prob- 
lem which could be solved either by a long 
method (B-A-C-C) or a direct one (A-C) 


which is actually one of the steps of the long 
method; five Einstellung or set problems 
soluble only by the long method; four crucial
problems, similar to the control problem and 
soluble either by the long or direct method; 
one extinction problem soluble only by a new 
and direct method not used in any previous 
problem; and finally two additional crucial 
problems. 
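The distinction between set problems and crucial problems may be illustrated with a small sketch. The jar capacities below are the widely reproduced values from Luchins's original series and are an assumption made here for illustration only; the actual problems used in the present study are not reproduced in the text. The long method is read as B - A - 2C and the direct method as A - C, following the notation above, and the function names are hypothetical.

```python
def long_method(a: int, b: int, c: int) -> int:
    """Fill B, pour off A once and C twice (B - A - C - C)."""
    return b - a - 2 * c


def direct_method(a: int, c: int) -> int:
    """Fill A and pour off C once (A - C)."""
    return a - c


def methods_that_solve(a: int, b: int, c: int, goal: int) -> list[str]:
    """List which of the two methods yields the required quantity."""
    found = []
    if long_method(a, b, c) == goal:
        found.append("long")
    if direct_method(a, c) == goal:
        found.append("direct")
    return found


# Illustrative values only (assumed, taken from the classic Luchins series).
print(methods_that_solve(21, 127, 3, 100))  # a set problem: long method only
print(methods_that_solve(23, 49, 3, 20))    # a crucial problem: either method
```

A subject who continues to use the long method on a problem of the second kind is giving the response that is scored as rigid.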


No published account of the alphabet-maze 
technique is available in the psychological 
literature, although Cowen [2] reports early 
pilot work with this technique. For present 
experimental usage the test series was ar- 
ranged so as to parallel exactly the water-jar 
technique. The same number of practice prob- 
lems, controls, set-builders, crucials, and ex- 
tinctions were used, in the same order as for 
the water-jar technique. 


In the administration of the alphabet-maze
technique, S is told that the object of the task is to move
from the upper right-hand corner of the maze to the
lower left-hand corner, spelling out words on the
way. He may move one box at a time in any
direction (up, down, left, right, or diagonal) as long as
the move he makes helps to spell out a word. The
answer which he is to write in the booklet consists
of the word or words that will take him from
start to finish of the maze. In case more than one
path is available, the correct solution is the one that
uses the fewest boxes. Figure 1 consists
of a sample practice, set, and “control-crucial”
problem used in this series.

The long solution used consists of moving along 
a down-left diagonal, then straight down, and fi- 
nally directly to the left. The direct solution is a 
straight diagonal path from the upper right-hand 
corner to the lower left-hand corner. The long
solution requires nine moves; the direct one only five
moves. 
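The move counts just given can be recovered from the sample solutions printed with Figure 1. This is a minimal sketch, assuming that S begins in the box holding the first letter and that each further letter costs one move; the function name is illustrative.

```python
def moves_required(solution: str) -> int:
    """Count moves for a spelled-out path: one per letter after the first."""
    letters = [ch for ch in solution if ch.isalpha()]
    return len(letters) - 1


long_moves = moves_required("HER RED SUIT")   # long solution in Fig. 1
direct_moves = moves_required("HER HAT")      # direct solution in Fig. 1
print(long_moves, direct_moves)
```

On this reading the long path takes nine moves and the direct path five, in agreement with the figures stated in the text.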


Problems in both test series were presented 
one by one and in order, on individual three- 
by-five index cards. Responses were recorded 
by Ss in examination booklets — a fresh page 
being taken for each problem. 


The rigidity score for each subject on each 
instrument consisted of the total number of 
crucial problems solved by the long method. 
In the case of the extinction problem nonsolu- 
tion in the 2!/4-minute time period allotted 
was considered to be a rigid response. Thus, 
the highest possible rigidity score on each 
instrument was seven. 


Subjects were eliminated for either of two 
reasons: (a) a long solution on the control 
problem, and (b) mathematical or word errors
on the test problems. 
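The scoring and elimination rules may be summarized in a short sketch. The record layout and field names are hypothetical; only the rules themselves come from the text: one point for each of the six crucial problems solved by the long method, one point for failure on the extinction problem within the allotted time, and elimination for a long solution on the control problem or for mathematical or word errors.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TestRecord:
    control_method: str             # "long" or "direct" (hypothetical coding)
    crucial_methods: Sequence[str]  # one entry per crucial problem (six in all)
    extinction_solved: bool         # solved within the allotted time?
    has_errors: bool                # mathematical or word errors on the test


def rigidity_score(record: TestRecord) -> Optional[int]:
    """Return the rigidity score (0 to 7), or None if the record is eliminated."""
    if record.control_method == "long" or record.has_errors:
        return None
    score = sum(1 for method in record.crucial_methods if method == "long")
    if not record.extinction_solved:
        score += 1  # nonsolution of the extinction problem counts as rigid
    return score


most_rigid = TestRecord("direct", ["long"] * 6, False, False)
print(rigidity_score(most_rigid))
```

The maximum of seven follows from the six crucial problems plus the single extinction problem.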


By these criteria, 47 subjects had acceptable 
records on both tests. The computed Pearson 
r between water-jar and alphabet-maze rigid- 
ity was .42 which was significant beyond the 
.01 level. Since an analysis of test scores
indicated essentially dichotomous, U-shaped
distributions for both measures, a phi coefficient
was computed and found to be .46. The
significance of this coefficient as tested by χ² was
beyond the .01 level of probability. A
tetrachoric r of .59 estimates the degree of
relationship between the instruments based on the
assumption of normality of distribution.
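The reported significance levels can be checked approximately from the coefficients and the sample size alone. The sketch below assumes the usual t transformation of a Pearson r with N - 2 degrees of freedom and the relation χ² = N·φ² for a fourfold table with one degree of freedom; whether the authors computed their tests in exactly this way is not stated, so the result is only a consistency check.

```python
from math import erfc, sqrt

n = 47        # usable pairs of records
r = 0.42      # Pearson r, water-jar vs. alphabet-maze rigidity
phi = 0.46    # phi coefficient from the dichotomized scores

# t statistic for a Pearson r, df = n - 2 = 45.
t_stat = r * sqrt(n - 2) / sqrt(1 - r**2)

# Chi-square for a 2 x 2 table expressed through phi, df = 1.
chi_sq = n * phi**2

# For df = 1 the chi-square tail probability has a closed form.
p_chi = erfc(sqrt(chi_sq / 2))

print(f"t = {t_stat:.2f} (tabled two-tailed .01 value at df = 45 is about 2.69)")
print(f"chi-square = {chi_sq:.2f}, p = {p_chi:.4f}")
```

Both statistics exceed their .01 critical values under these assumptions, which is consistent with the levels reported above.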


Discussion 


General confirmation of the original hy- 
pothesis is found in the low positive but sig- 
nificant correlation between the two rigidity 
measures. Under the specific and constant field 
conditions used in the present investigation, 
and with other pertinent variables nonoperant, 
individuals do respond differentially in terms 
of rigidity behavior on any given task. More- 
over, we can not account for such response 
tendencies solely on the basis of specific task
attitudes, in the light of their generalization to
a different area of functioning.

Though the present investigators attach im- 
portance to this finding, caution must be 
exercised in its interpretation and generaliza- 
tion. A sweeping generalization, such as that 
problem-solving rigidity is a function of per- 
sonality, is not indicated or implied. Quite 
conceivably there could be situations in which 
field factors are so compelling as to preclude 
entirely the operation of personality factors. 
Illustratively, had the experimenters used 100 
set problems, very likely all Ss would have 
responded “rigidly” on all subsequent
crucials. On the other hand, there appears to exist
a potential “balance point” in experimental
field conditions, at which a relatively wide
range of rigidity responses is elicited from
individuals of comparable age, intelligence, and
education. Under such conditions in the
present study, a tendency towards a generalized
mode of problem solution has been
demonstrated, and the likelihood that this tendency
is “personality-related” strengthened.

The precise relationship between rigidity 
and other personality variables is likely to vary 
considerably under different field conditions. 
It may well be that this fact underlies many 
of the discrepant findings in this area to date. 
We advance a general hypothesis, however, 
that the relationship will be greatest when 
field conditions are such as to bring about, in 
an otherwise homogeneous group of subjects, 
the U-shaped type of distribution of rigidity
scores described above.

A final issue which merits some considera- 
tion centers around the question of what is 
the most fruitful direction that subsequent 
research in this area can follow. We propose 
that it is not helpful and very probably not 
even meaningful to ask a question such as, “Is 
there a relationship between rigidity and per- 
sonality?” Such a phrasing, while perhaps 
inviting to those who seek simple linear rela- 
tionships, overlooks the multiplicity of settings 
in which the covariation of these variables 
may conceivably be studied. The present 
authors view with considerably greater opti- 
mism an approach which seeks to specify 
precisely a wide variety of experimental field 
conditions and to examine the relationship 
between problem-solving rigidity and other
designated personality variables, under these
measurable conditions.


Summary 


In order to test the hypothesis of generaliza- 
tion of problem-solving rigidity the authors 
have developed an alphabet-maze scale, de- 
signed to parallel the Luchins water-jar tech- 
nique in structure, but to deal with a differ- 
ent area of cognitive function. 

The two instruments were given to 59 
college undergraduates, and on the basis of 
the scoring criteria used, 47 pairs of usable 
test records were obtained. The significant 
correlation between the two measures was 
taken as a confirmation of the original hy- 
pothesis. Under the particular field conditions 
employed, a generalization of problem-solving 
rigidity has been demonstrated. 

The limitations of the present findings, as 
well as their implications for subsequent re- 
search have been considered. 


Received August 11, 1952. 


References 


1. Christie, J. R. The effects of frustration on
rigidity in problem solution. Unpublished
doctor’s dissertation, Univer. of California, 1949.

2. Cowen, E. L. A study of the influence of
varying degrees of psychological stress on problem
solving rigidity. Unpublished doctor’s
dissertation, Syracuse Univer., 1950.

3. Cowen, E. L. Psychological stress and problem
solving rigidity. J. abnorm. soc. Psychol.,
1952, 47, 512-519.

4. Cowen, E. L., & Thompson, G. G. Problem
solving rigidity and personality structure. J.
abnorm. soc. Psychol., 1951, 46, 165-176.

5. Fisher, S. Patterns of personality rigidity and
some of their determinants. Psychol. Monogr.,
1950, 64, No. 1 (Whole No. 307).

6. Frenkel-Brunswik, Else. Intolerance of
ambiguity as an emotional and perceptual
variable. J. Pers., 1949, 18, 108-143.

7. Harway, N. I. Personality variables in
problem solving rigidity inferred from behavior
in the level of aspiration situation.
Unpublished doctor’s dissertation, Univer. of
Rochester, 1952.

8. Luchins, A. S. Mechanization in problem
solving: The effect of Einstellung. Psychol.
Monogr., 1942, 54, No. 6 (Whole No. 248).

9. Luchins, A. S. Rigidity and ethnocentrism: A
critique. J. Pers., 1949, 17, 449-466.

10. Rokeach, M. Generalized mental rigidity as a
factor in ethnocentrism. J. abnorm. soc.
Psychol., 1948, 43, 259-278.








Journal of Consulting Psychology 
Vol. 17, No. 2, 1953 


The Use of Rorschach Scores to Predict Whether 
Patients Will Continue Psychotherapy 


Frank Auld, Jr., and Leonard D. Eron¹


Yale University 


Interpreting psychological tests is a serious 
matter, for interpretations affect the lives of 
patients. Decisions to accept patients or reject 
them, to try psychotherapy or not, to use one 
method of treatment instead of another—all 
these may be made on the basis of psychological 
reports. The psychologist, then, has an ethical 
responsibility to be sure that his statements 
are founded on scientific knowledge. He can- 
not afford to rely on guesswork; he must be 
as right as scientific method can enable him to 
be. In this paper we point out that the clinical 
psychologist who wants to be right must be 
sure that the diagnostic devices he uses have 
been cross validated. Of course, practically 
all psychologists agree with this proposition in 
the abstract. We hope to buttress the logic of 
this matter with concrete examples so that the 
reader will never be able to forget the rule: 
If the combining weights of a set of predic- 
tors have been determined from the statistics 
of one sample, the effectiveness of the pre- 
dictor-composite must be determined on a 
separate, independent sample [Mosier, 9].


Background of This Study 


Many psychiatric clinics have long waiting
lists and can accept for treatment only a few
of the many people that apply. Thus there is
a great need for diagnostic methods that can
predict which of the applicants can profit most
from therapy. Kotkov and Meadow [6, 7]
have proposed a method of doing this by
means of certain Rorschach scores. These
authors tried out all of the common
Rorschach scores; they found only three that
discriminated between patients who continued for
at least nine interviews and patients who quit.
By means of Fisher’s discriminant function
Kotkov and Meadow combined these scores
into a prediction formula: Y = .00038 R +
.00007 D% + .00241 (FC − CF). (R is the
total number of responses to the Rorschach;
D%, the percentage of these that are large
usual details; FC − CF, the number of
form-color responses minus the number of
color-form responses.) The weights for this
formula were derived from a sample of 98 patients
who received group psychotherapy at the
Boston VA Mental Hygiene Clinic. When
applied to this sample, from which the weights
had been derived, the formula correctly
classified 69% of the subjects as “continuing”
or “discontinuing” patients.

1 The authors wish to thank Dr. Mark A. May,
director of the Institute of Human Relations, for the
Institute’s financial support of this study. Dr. Neal
E. Miller read this paper and made helpful
suggestions.
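The prediction formula can be written out as a small function. This is a sketch only: it assumes that D% is entered in percentage points (60 rather than .60), the Rorschach record shown is invented for illustration, and the cutting score that separates predicted continuers from quitters is not given in this portion of the text and is therefore left out.

```python
def kotkov_meadow_score(r_total: float, d_percent: float, fc_minus_cf: float) -> float:
    """Discriminant score Y = .00038 R + .00007 D% + .00241 (FC - CF)."""
    return 0.00038 * r_total + 0.00007 * d_percent + 0.00241 * fc_minus_cf


# Hypothetical record: 20 responses, 60 per cent large usual details,
# one more form-color than color-form response.
y = kotkov_meadow_score(r_total=20, d_percent=60, fc_minus_cf=1)
print(f"Y = {y:.5f}")
```

Because the three terms are simply added, the formula can only separate groups to the extent that each score separately shifts between continuers and quitters in new samples.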


Kotkov and Meadow then applied this 
formula to a sample of 52 individual psycho- 
therapy patients in the Boston clinic. Note 
that this is not a classical cross-validation de- 
sign, since the patients were not drawn from 
the same population (namely, group therapy 
patients) as the patients in the original sam- 
ple. This type of study, in which weights de- 
termined on a sample from one population 
are applied to a second sample drawn from a 
differently defined population, may be called 
“validity generalization” [Mosier, 9]. When 
applied to the individual therapy sample the 
formula discriminated continuers from quit- 
ters just as well as it had in the original sam- 
ple of group therapy patients, making correct 
predictions for 36 of the 52 patients (69%). 
Not all of the scores entering into the
formula, however, discriminated continuers and
quitters among the individual psychotherapy
patients;² only the FC − CF score did so to a
statistically significant degree. The R score
was almost significant for the group therapy 
patients and statistically significant for the in- 
dividual therapy patients. The D% score was 
significant at the .30 level for the group ther- 
apy patients, but the direction of the differ- 
ence was reversed in the individual therapy 
sample. It is surprising that there was no loss 
in predictive efficiency when the formula was 
applied to a new sample. Shrinkage in multi- 
ple correlation (the discriminant function is 
analogous to multiple correlation) has been 
the regular finding of psychologists who have 
utilized prediction formulas. 

Since predictions stemming from this
formula are based solely on the addition of three
scores, it might be felt that the formula does
not get the best possible predictions out of
the data. We believe, however, that the
formula probably does as well with the materials
it embraces (the commonly used Rorschach
scores) as any subjective method for making
predictions from these scores would do. The
reasoning of Thorndike [12, pp. 200-201] 
and the experience of Kelly and Fiske [4, pp. 
200-202] support this view. 

Although Kotkov and Meadow’s prediction 
formula worked well when applied to the 
Boston samples, we felt that it might not be 
equally effective when applied to new samples 
of patients. We shall consider the reasons for 
this belief later on, in the discussion section. 
Our belief that there would be loss in predic- 
tive efficiency led us to try out their formula 
on a new sample of patients. 


Procedure 


Subjects. The subjects are 33 patients who 
received individual psychotherapy in the Psy- 
chiatric Outpatient Clinic at the New Haven 
Hospital. All patients who received individual 
psychotherapy during the period from July 
1949 through June 1952, who were treated by 
a regular staff member (psychiatric resident) 
of the clinic or by a psychology graduate stu- 
dent under close supervision, and who took the 
Rorschach test at or near the beginning of 
treatment (before the ninth treatment inter- 
view) are included in our sample. Two-thirds 


2 Personal communication from Dr. Meadow. 




of these patients were diagnosed as neurotic;
the remaining third were diagnosed as suffer- 
ing from psychosis or character and behavior 
disorder. 

Not all patients seen in the clinic during this 
period were tested, since the psychologists’ 
work load permitted testing a patient only 
when the therapist specifically requested it. 

Assignment to continuing or discontinuing 
group. A patient was assigned to the “continu- 
ing” group if he had kept at least 9 appoint- 
ments for psychotherapeutic interviews or if he 
had quit before the ninth interview with the 
consent of the therapist. Otherwise the patient 
was assigned to the “discontinuing” group. 
There were 21 patients in the continuing group 
and 12 in the discontinuing group. 

In counting the number of interviews, we 
started with the one immediately following the 
test administration. Interviews before the test- 
ing did not count toward the total of nine. 
Practically this did not make any difference 
because no patient who was assigned to the dis- 
continuing group had as many as 9 interviews 
whether or not the pretesting sessions were 
counted. 
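The assignment rule just described can be written out as a short sketch. The function and argument names are illustrative only; they are not taken from the study.

```python
def assign_group(interviews_after_testing: int, quit_with_consent: bool = False) -> str:
    """Return 'continuing' or 'discontinuing' under the rule stated in the text.

    interviews_after_testing counts only interviews held after the Rorschach
    was administered; quit_with_consent marks a patient who stopped before the
    ninth interview with the therapist's consent. Both names are hypothetical.
    """
    if interviews_after_testing >= 9:
        return "continuing"
    if quit_with_consent:
        return "continuing"
    return "discontinuing"


print(assign_group(12))        # kept at least 9 appointments
print(assign_group(4, True))   # quit early, but with consent
print(assign_group(4))         # quit early without consent
```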

Recording of Rorschach scores. Records 
were scored according to the procedures de- 
scribed by Klopfer and Kelley [5]. The 
original scoring was reviewed by the authors 
who reached a consensus on the scoring of all 
the responses. The total number of responses, 
the number of large-usual-detail responses, the 
number of FC responses, and the number of 
CF responses were recorded for each patient. 

Computation of Y score. A Y score for each
patient was computed according to the Kotkov- 
Meadow formula. 


Results 


The Kotkov-Meadow formula failed to pre- 
dict which of the 33 patients continued in 
therapy and which did not. When patients 
with a Y score of .01132 or higher were as- 
signed to the “continuing” group and those 
with lower scores to the “discontinuing” group, 
17 out of 33 patients were correctly classified 
(52%). Obviously, this is no better than we 
could do by tossing a coin or throwing dice. 
The biserial correlation between Y score and 
the criterion was + .19, which is not signifi- 
cant. 
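The comparison with coin tossing can be checked with a small sketch. It assumes, as the coin-toss analogy does, that each classification has an even chance of being right; the function names are ours.

```python
from math import comb


def hit_rate(correct: int, total: int) -> float:
    """Proportion of patients classified correctly."""
    return correct / total


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


rate = hit_rate(17, 33)             # figures reported in the text
tail = binomial_upper_tail(17, 33)  # chance of 17 or more hits by guessing
print(f"hit rate = {rate:.2f}; P(17 or more hits by chance) = {tail:.2f}")
```

With 33 cases and an even chance per case, 17 or more hits would turn up in half of all such guessing exercises, which is what "no better than tossing a coin" amounts to.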










Although the Kotkov-Meadow formula did 
not predict continuance of psychotherapy 
among our group, there was a chance that one 
or two of the scores utilized in the formula 
would do so. Accordingly, we tested the cor- 
relation of R, D%, and FC — CF with the 
criterion. The chi-square test showed only R to
be significantly associated with continuance in
therapy (χ² = 7.14, with 1 df; p < .01).
When patients were assigned to the “continuing”
group if they gave 19 or more responses
(18 was the median number for all 33 patients),
22 of the 33 patients were correctly
classified (67%).

In order to test whether there was a con- 
tinuous relationship, over the whole range of 
R scores, between R and perseverance in thera- 
py, we applied Festinger’s d test for rank- 
ordered data [2]. This test does not involve 
any assumption about the form of the distribu- 
tion function. According to this test the as- 
sociation between number of responses and con- 
tinuance in therapy was not statistically sig- 
nificant (p = .11). 

The association between FC—CF score and 
continuance in therapy is in the opposite direc- 
tion from that predicted by Meadow’s formula. 
In our sample of 33 patients, a low FC — CF 
score is associated with continuance in therapy. 
This relationship is not statistically significant 
(χ² = 3.79, with 1 df; p < .10).
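The probability levels attached to the two chi-square values can be recovered from the values themselves. The sketch below relies on the fact that a chi-square with 1 df is the square of a standard normal deviate; the 2 × 2 frequencies are not given in the text, so only the reported statistics are used.

```python
from math import erfc, sqrt


def chi_square_p_value_1df(chi_square: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return erfc(sqrt(chi_square / 2.0))


p_r = chi_square_p_value_1df(7.14)      # R against continuance
p_fc_cf = chi_square_p_value_1df(3.79)  # FC - CF against continuance
print(f"R: p = {p_r:.4f}; FC - CF: p = {p_fc_cf:.4f}")
```

Under this calculation the first value falls below .01 and the second falls between .05 and .10, in agreement with the levels reported.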

The use of IQ score as a predictor. Since 
various studies [10, 13, 14] have shown a cor- 
relation between number of Rorschach re- 
sponses and intelligence, we believed it possible 
that the relationship between number of re- 
sponses and continuance in therapy only 
demonstrated that more intelligent patients are 
more likely to continue. To test this hy- 
pothesis, we computed the correlation between 
number of responses and continuance in thera- 
py, with IQ score partialed out. The tetra- 
choric correlation between R and continuance 
was + .55; but when IQ (as measured by 
verbal scale of the Wechsler-Bellevue) had 
been partialed out, the correlation dropped to 
+ .07. Apparently the relationship between R 
and continuance was entirely accounted for by 
their correlations with verbal IQ. The IQ 
score gave better predictions of continuance 
than the number of Rorschach responses. The 
biserial correlation between IQ and continuance
was + .71 (significant at .05 level).
Since IQ scores were not available for all the
patients, these correlations were all computed
from a sample of only 23 patients.
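Partialing out a third variable can be illustrated with the usual first-order partial correlation formula. This is a sketch only: the text does not say how the authors computed their partial coefficient, the coefficients involved are tetrachoric and biserial rather than product-moment, and the correlation between R and IQ is not reported, so the value used for it below is hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))


r_responses_continuance = 0.55  # reported in the text
r_iq_continuance = 0.71         # reported in the text
r_responses_iq = 0.73           # HYPOTHETICAL: not reported in the text

partial = partial_correlation(
    r_responses_continuance, r_responses_iq, r_iq_continuance
)
print(f"partial r (R and continuance, IQ held constant) = {partial:.2f}")
```

With a correlation of about this size between R and IQ, the partial coefficient drops to the neighborhood of the + .07 reported, which shows how a sizable zero-order correlation can vanish once a common correlate is held constant.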


Discussion 


Why did the Kotkov-Meadow formula fail 
to predict continuance in psychotherapy in our 
sample? Before we can answer this question, it 
is necessary to consider the reasons why cross- 
validation and validity-generalization studies 
are necessary. 

The need for cross validation and validity 
generalization. When a prediction formula is 
applied to the sample from which it has been 
derived, it is bound to work well. When ap- 
plied to a new sample, it often works less well. 
There are four reasons for this: 

1. A multiple correlation, like all correla- 
tion coefficients, is subject to sampling error. 
The sampling error of the multiple correlation 
coefficient is, however, greater than the sam- 
pling error of a simple correlation coefficient. 
Thorndike points out, “Even if the correlations 
with the criterion are all zero in the popula- 
tion, some prediction will be possible within 
the sample. ... The more predictor variables, 
the higher the multiple correlation which one 
may expect to derive from them, in the absence 
of any genuine relationship” [12, p. 203]. 
Since Kotkov and Meadow tried out about 30 
Rorschach scores before selecting 3 for their 
formula, their formula in effect is based on 30 
predictor variables, 27 of which are given zero 
weights and 3 of which are given positive 
weights. 

2. Thorndike also reminds us that “in ad-
dition to any true relationship that may exist 
between the test variables and the criterion in 
the parent population, a prediction based on a 
few chosen tests capitalizes on chance fluctu- 
ations in validities and intercorrelations of the 
variables in the specific sample. The 
smaller the number of variables selected to be 
weighted, the more premium is placed upon 
favorable chance fluctuations in those particu- 
lar variables retained and weighted. It must be 
anticipated, therefore, that the multiple cor- 
relation obtained by weighting only a picked 
few of the tests of a battery will show a marked 
shrinkage, greater than when the regression 
weights are based on all the tests” [12, pp. 




204-205]. Kotkov and Meadow gave greater- 
than-zero weights to only 3 variables. The 
shrinkage, therefore, is expected to be greater 
than it would be if they had retained all 30 
variables. Since, however, they probably gave 
zero weights to the other 27 variables because 
these variables would have had very small 
weights, the effect of selecting only 3 out of 
30 is probably much less important than the 
effect of trying out so many variables (the 
reason for shrinkage discussed in paragraph 1 
above). 

3. Quite apart from the two statistical 
reasons just discussed, a multiple correlation 
coefficient may shrink on cross validation be- 
cause the new sample does not resemble the 
old in every relevant characteristic. In this im- 
perfect world, successive samples can seldom 
be identical in every important way. There- 
fore it cannot be strictly true that successive 
samples are “drawn from the same popula- 
tion.” Mosier [9] has called attention to this 
problem. He deals with the purely statistical 
reasons for shrinkage in multiple correlation 
under the heading of “cross validation,” and 
with the shrinkage caused by changes in char- 
acteristics of the samples under the heading of 
“validity generalization.” He points out that 
studies on new and different samples are neces- 
sary to establish the limits of applicability of 
prediction formulas. 

4. The likelihood that the positive findings
of Kotkov and Meadow — or of any investi- 
gators — are partly due to chance fluctuations 
is increased by the fact that published studies 
are not a random sample of all studies done. 
Because of a feeling among psychologists that 
positive results are more worth while than 
negative findings, published studies contain an 
unusually high proportion of positive findings. 
No one knows how many studies had to be 
done to turn up the positive results that appear 
in the published studies. 
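The first two reasons can be illustrated by a small simulation. It is a sketch under stated assumptions, not a reconstruction of Kotkov and Meadow's procedure: 30 scores are pure noise, the 3 with the largest sample validities are kept, and a unit-weighted composite stands in for the discriminant function. The sample size of 50 and the number of trials are arbitrary choices.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def draw_sample(rng, n_patients, n_scores):
    """Noise only: no score is related to the criterion in the population."""
    scores = [[rng.gauss(0, 1) for _ in range(n_patients)] for _ in range(n_scores)]
    criterion = [rng.randint(0, 1) for _ in range(n_patients)]
    return scores, criterion


def composite(scores, picked):
    """Unit-weighted sum of the picked scores, signed by their sample validity."""
    n = len(scores[0])
    return [sum(sign * scores[j][i] for j, sign in picked) for i in range(n)]


def one_trial(rng, n_patients=50, n_scores=30, n_keep=3):
    """Derive a formula on one sample, then try it on a fresh sample."""
    scores, criterion = draw_sample(rng, n_patients, n_scores)
    validities = [pearson(s, criterion) for s in scores]
    best = sorted(range(n_scores), key=lambda j: abs(validities[j]), reverse=True)
    picked = [(j, 1 if validities[j] >= 0 else -1) for j in best[:n_keep]]
    r_derivation = pearson(composite(scores, picked), criterion)
    new_scores, new_criterion = draw_sample(rng, n_patients, n_scores)
    r_new = pearson(composite(new_scores, picked), new_criterion)
    return r_derivation, r_new


rng = random.Random(1953)
trials = [one_trial(rng) for _ in range(200)]
mean_derivation = sum(t[0] for t in trials) / len(trials)
mean_new = sum(t[1] for t in trials) / len(trials)
print(f"mean r, derivation sample: {mean_derivation:.2f}")
print(f"mean r, new sample:        {mean_new:.2f}")
```

Although nothing predicts the criterion in the population, the picked composite shows a substantial correlation in the sample that produced it and a correlation near zero in a fresh sample.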

In the field of clinical psychology, cross- 
validation studies are relatively infrequent. 
Probably a prime reason for this is the great 
value placed by the scientific community on 
original contributions. Repetitions of past work 
are valued less than new work: this discourages 
cross-validating studies. Furthermore, clinical 
psychologists are likely to feel that any failure 
on their part to reproduce findings of a pioneer 




worker may be attributed to a lack of clinical 
skill. This climate of opinion deters clinical 
psychologists who get negative results from 
publishing their findings. 

Examples of shrinkage in multiple correla-
tion. An example may serve to emphasize the 
importance of shrinkage in multiple correla- 
tion. Buhler, Buhler, and Lefever [1] worked 
out a “Basic Rorschach Score” to measure ad- 
justment and discriminate between persons 
who are well adjusted and persons who are 
poorly adjusted. Although it was not derived 
by the conventional multiple-regression pro- 
cedures, the Basic Rorschach Score is essential- 
ly a multiple-regression formula. The Basic 
Rorschach Score worked well on the groups 
from which it had been derived, but showed 
a marked drop in discriminating ability when 
tried out on other groups. In the original 
sample it discriminated normal subjects (mean 
score = 21) from neurotics (mean score = 8). 
But when the Buhlers tried the Basic Rorschach
Score on a new sample, it failed to discriminate
nurses (mean score = 12) or other
normal persons (mean score = 14) from
hysterics (mean score = 12). The Buhlers
then maintained that the nurses and other
normal people of the new sample were after all 
not so normal as the original group of normal 
subjects. We believe, however, that the failure 
of the Basic Rorschach Score to discriminate in 
the new sample illustrates shrinkage in mul- 
tiple correlation. 

Kurtz [8] also reported a dramatic shrink- 
age of multiple correlation upon cross valida- 
tion. A list of 32 Rorschach signs correctly 
classified 79 out of 80 insurance agency man- 
agers as successful or unsuccessful — in the 
sample from which the sign list had been de- 
veloped. When the signs were applied to a 
new sample of agency managers, the correla- 
tion between sign score and success dropped to 
+ .02.

Why did the Kotkov-Meadow formula fail 
in the New Haven sample? Since Kotkov and 
Meadow did attempt a cross validation of their 
findings — strictly speaking a validity-generali- 
zation study — and found that their formula 
also worked in the new sample, it seems un- 
likely that the failure of the formula when ap- 
plied to our sample can be entirely attributed 
to sampling error. Although this seems un- 










likely, part of the loss in predictive efficiency 
may nevertheless be laid to this source of 
shrinkage. If the fact that the formula worked 
for them but not for us cannot be wholly ex-
plained by sampling error, it can probably be 
explained by differences between the patients 
seen in the Boston VA Clinic and in New 
Haven Hospital. The patients in our study,
however, seem to be as much like those seen 
in the Boston clinic as patients in different 
clinics usually are. Therefore, we have dis- 
covered an important limitation to the gener- 
ality of Meadow’s formula. Our finding that 
the formula does not work in the New Haven 
sample thus illustrates the importance of the 
type of study that Mosier called “validity gen-
eralization.”

Alternative explanations. It is possible, of 
course, that a VA facility gets patients who 
differ from non-VA patients in motivation for 
psychotherapy. In a private-hospital clinic, the 
fee screens out those with only very slight 
motivation for treatment. The VA patient is 
treated free of charge; thus, in a VA facility, 
some patients with marginal motivation begin 
treatment. This difference in motivation may 
affect any attempt to predict continuance of 
patients in therapy. However, the fact that 
Rogers, Knauss, and Hammond [11], working
in a VA clinic, were not able to predict per- 
severance in therapy from Rorschach scores, 
casts doubt on this interpretation. 

Another possible explanation of our results 
is that the treatment procedure is different or 
the skill of the therapists is different. These 
might have an effect on the applicability of a 
prediction formula. In both clinics, the treat- 
ment is described as psychotherapy, but this 
does not guarantee similarity of method. In 
both clinics, psychotherapy was carried out by 
therapists believed to be competent, but thera- 
pists in one clinic may have been much more 
skilled than those in the other. We cannot de- 
cide from the available information whether 
such differences might account for the differ- 
ences in findings. We can say, however, that if 
such differences in method and skill do affect 
the usefulness of the prediction formula, then 
the generality and usefulness of the formula are
quite limited. 

Implications of this study. Our validity- 
generalization study shows that the Kotkov- 




Meadow formula is not generally applicable 
as a method for predicting continuance in psy- 
chotherapy. It also calls in question these 
authors’ interpretation of their results. Kotkov 
and Meadow believed that patients who con- 
tinued in therapy did so because they had a 
greater capacity for bearing the anxiety aroused 
by psychotherapy. They thought that the FC — 
CF score was a measure of this capacity for 
bearing anxiety — as Rorschach workers 
would say, a measure of “intellectual control” 
of emotions. Our findings that the patients who 
quit have higher FC — CF scores would seem 
to show either that FC — CF is not a measure 
of intellectual control or that intellectual con- 
trol is not related to continuance in therapy. 
Our study does confirm one of Kotkov and 
Meadow’s findings. Like them, we find that 
the total number of responses (R) is associated 
with continuance in psychotherapy. The as- 
sociation is not continuous over all values of 
R; i.e., we do not find that the more responses 
a patient gives, the better chance there is he 
will stay in treatment. Rather, we find that 
he is more likely to stay in treatment if he 
gives at least a “sufficient” number of re- 
sponses, i.e., if he gives more than 18, which is 
the median number in the whole sample. One 
plausible interpretation of this is that giving a 
“sufficient” number of responses indicates a co- 
operative attitude and that the patient who is 
motivated to cooperate in the testing is also 
motivated to cooperate in the psychotherapy. 
Our finding that the Wechsler-Bellevue 
verbal score is, in our sample, a better predictor 
of persistence in therapy than the number of 
Rorschach responses suggests that this test may 
be more useful than the Rorschach for predict- 
ing continuance in therapy. Such a use of the 
Wechsler-Bellevue verbal scale should, how- 
ever, be supported by evidence from cross-valid- 
ation studies before it is accepted as valid. 
Built-in cross validation. The present au- 
thors urge the general adoption of an ingenious 
method of “built-in cross-validation” used with 
good success in the AAF Psychology Program. 
To use this method [Guilford, 3, p. 884], the 
investigator divides his original sample into 
two random halves. He develops a multiple- 
regression formula from the data of one-half 
of the sample and then applies it to the data in 
the other half. This method insures that error 




variance will not inflate the coefficient of mul- 
tiple correlation. It does not, however, take 
care of the shrinkage of multiple R attributable 
to differences in the characteristics of successive 
samples. Only a validity generalization study, 
making use of an entirely new group, can do 
this. 
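The split-half procedure can be sketched as follows. For brevity the "formula" here is a single cutting score on R rather than a multiple-regression equation, and the records are synthetic; both are our simplifications, not part of the method as Guilford describes it.

```python
import random


def split_halves(records, seed=0):
    """Shuffle a copy of the records and split it into two random halves."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


def hit_count(half, cutoff):
    """Count patients for whom 'R >= cutoff' agrees with 'continued'."""
    return sum((r >= cutoff) == continued for r, continued in half)


def best_cutoff(half):
    """Pick the cutting score on R that classifies the most patients correctly."""
    candidates = sorted({r for r, _ in half})
    return max(candidates, key=lambda c: hit_count(half, c))


# SYNTHETIC records for illustration only: (number of responses, continued?)
rng = random.Random(17)
records = [(rng.randint(8, 40), rng.random() < 0.6) for _ in range(60)]

derivation, holdout = split_halves(records, seed=2)
cutoff = best_cutoff(derivation)  # developed on one half only
derivation_rate = hit_count(derivation, cutoff) / len(derivation)
holdout_rate = hit_count(holdout, cutoff) / len(holdout)  # applied to the other
print(f"cutoff = {cutoff}; derivation half = {derivation_rate:.2f}; "
      f"holdout half = {holdout_rate:.2f}")
```

The figure for the holdout half is the one to report, since the cutting score was chosen without reference to those cases.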


Summary


A trial on a new sample of patients of a 
formula developed by Kotkov and Meadow for 
predicting whether patients will continue in 
psychotherapy was made by the present au- 
thors. It was found that the formula, which 
used three common Rorschach scores: R, D%, 
and FC — CF, failed to predict which of a new 
sample of 33 psychiatric clinic patients would 
continue in psychotherapy and which would 
quit. Although one of the scores, total number 
of responses (R), was found to discriminate 
the two kinds of patients, it is felt that this is 
a measure of the patient’s motivation to co- 
operate or perhaps an index to his functioning 
intelligence. Both of these can be assessed by 
more appropriate techniques than the Ror- 
schach test. In fact, we found that all of the 
correlation between R and continuance can be 
accounted for by their correlations with the 
Wechsler-Bellevue Verbal IQ; and the latter 
test may be a much better predictor of persis- 
tence in therapy than any Rorschach score or 
combination of scores. 

The findings of this study thus demonstrate 
the need for cross-validation and validity-gen- 
eralization studies. Until such cross validation 
and validity generalization has been done, no 
confidence can be placed in the general appli- 
cability of any method of diagnosis or any 
formula for predicting behavior. 


Received September 2, 1952. 


References


1. Buhler, Charlotte, Buhler, K., & Lefever, D. W.
   Rorschach standardization studies. No. 1; Development
   of the Basic Rorschach Score. Los
   Angeles: Authors, 1948.

2. Festinger, L. The significance of difference
   between means without reference to the frequency
   distribution function. Psychometrika,
   1946, 11, 97-105.

3. Guilford, J. P. (Ed.) Printed classification
   tests. A.A.F. Aviation Psychology Program
   Research Report No. 5. Washington, D.C.: U.
   S. Government Printing Office, 1947.

4. Kelly, E. L., & Fiske, D. W. The prediction of
   performance in clinical psychology. Ann Arbor:
   Univer. of Michigan Press, 1951.

5. Klopfer, B., & Kelley, D. McG. The Rorschach
   technique. Yonkers, N.Y.: World Book
   Company, 1946.

6. Kotkov, B., & Meadow, A. Rorschach criteria
   for continuing group psychotherapy. Int. J.
   Group Psychother., 1952, 2, in press.

7. Kotkov, B., & Meadow, A. Rorschach criteria
   for predicting continuation in individual psychotherapy.
   J. consult. Psychol., 1953, 17, 16-20.

8. Kurtz, A. K. A research test of the Rorschach
   test. Personnel Psychol., 1948, 1, 41-51.

9. Mosier, C. I. Problems and designs of cross-validation.
   Educ. psychol. Measmt, 1951, 11,
   5-11.

10. Neff, W. S., & Lidz, T. Rorschach pattern of
    normal subjects of graded intelligence. J. proj.
    Tech., 1951, 15, 45-57.

11. Rogers, L. S., Knauss, Joanne, & Hammond,
    K. R. Predicting continuation in therapy by
    means of the Rorschach test. J. consult. Psychol.,
    1951, 15, 368-371.

12. Thorndike, R. L. Personnel selection. New
    York: Wiley, 1949.

13. Wishner, J. Rorschach intellectual indicators
    in neurotics. Amer. J. Orthopsychiat., 1948, 18,
    265-279.

14. Wittenborn, J. R. Certain Rorschach response
    categories and mental abilities. J. appl. Psychol.,
    1949, 33, 330-338.








Journal of Consulting Psychology
Vol. 17, No. 2, 1953


Rorschach Scoring Categories as Diagnostic “Signs”¹


Martin Berkowitz and Jacob Levine 


Veterans Administration Hospital, Newington, Conn. 


The use of certain scoring categories in the
Rorschach for discriminating between the 
“normal,” neurotic, and the psychotic has con- 
siderable appeal, particularly for the clinician 
who is seeking a dependable screening tech- 
nique. As a clinical procedure, these Rorschach 
“signs” would have much to recommend them,
if their validity in differential diagnosis were 
well established. But as yet few attempts have 
been made to determine on a proper statistical 
basis just how effectively these “signs” are able 
to discriminate between the major psychiatric 
groups. 


In their original study, Miale and Har- 
rower-Erickson [7] found that 9 such “signs” 
could discriminate between the neurotic and 
the “normal.” These were: (a) 25 or fewer
responses; (b) not over 1 M; (c) more FM
than M; (d) color shock; (e) shade shock;
(f) refusal to respond on any card; (g) over
50% F responses; (h) A% over 50; (i) not
over 1 FC. These investigators observed that
the presence of 5 or more of these “signs” “suggests
strongly the presence of psychoneurosis.”
Harrower [3, 4] later confirmed these find- 
ings. 

The question arises whether or not these 
same “signs,” now identified as “neurotic
signs,” are equally able to differentiate between
the neurotic and psychotic. The difficulty in
differentiating these two groups on a clinical
basis is not nearly so great as it is between the 
neurotic and the “normal.” Harrower-Erick- 
son states “... we have found in a preliminary 
study with 30 psychotic patients that while a 
greater percentage than in the case of normal 


1 Submitted with the approval of the Chief Medi- 
cal Officer, Veterans Administration, who assumes 
no responsibility for the opinions expressed here, 
which are those of the authors. 


controls will be shown to have over 5 signs, 
the number will not be nearly so high as in a 
group of neurotic patients. Moreover, when 
the incidence of the individual signs shown in 
the psychotic records is plotted against the 
normal or the neurotic, the curve will be found 
to have its own pattern in contradistinction to 
that of the neurotic or normal” [3, p. 113]. 
In view of the fact that these interesting find- 
ings are the result of a preliminary study with 
no data or statistics reported, it appears desir- 
able to repeat the study. 

The present study is therefore an attempt to 
test the validity of these 9 Rorschach scoring 
categories as differentiating “signs” between
psychoneurosis and psychosis. Further, an at- 
tempt will be made to determine whether or 
not other scoring categories, particularly F+% 
and number of Populars, are more sensitive 
differentiators than the others. Clinicians gen- 
erally use the latter two categories as important 
indicators of psychosis. On this basis, we 
might expect that F+% and number of Pop- 
ulars will discriminate in a statistically signifi- 
cant way between the psychotic and the non- 
psychotic. 

Two groups of psychiatric patients from two 
Veterans Administration hospitals comprised 
the subjects of the present study. One group 
consisted of 25 neurotic patients who were ran- 
domly selected from the open ward of the hos- 
pital. The diagnosis of psychoneurosis was 
made by staff psychiatrists after careful review 
of the clinical findings and a number of inter- 
views. The second group consisted of 25 psy- 
chotic patients who were also randomly select- 
ed from the closed ward of a second Veterans 
Administration hospital. All were clinically 
diagnosed as schizophrenic. 

The Rorschach protocols of both groups of 




patients were thoroughly reviewed to insure 
consistency of scoring. Color and shade shock 
were evaluated according to the criteria of 
Brosin and Fromm [2]. Klopfer’s classifica- 
tion of popular responses was used. Five or less 
popular responses was taken as a psychotic 
“sign” on the basis of Klopfer’s statement that 
“the use of 5 or more popular concepts seems 
to assure that the subject possesses capacity and 
interest in thinking along the same lines as 
other people in sufficient degree” [6, p. 216]. 
The cutting score for F+% as a psychotic 
“sign” was taken as 60 which Beck [1] states
is the “critical minimum.” 

The differences in the number of neurotic 
and psychotic “signs” which were obtained 
from the present two samples were treated for 
significance by means of a test designed to 
measure the significance of the differences between
percentages [5]. The significance of the
difference between the neurotic and the psychotic
groups with respect to mean absolute frequency
scores of these “signs” was determined
by the use of the t test.


Table 1

Comparison of “Neurotic Signs” between Neurotics
and Psychotics in Terms of Percentages

                              Total Signs            Percentage        Significance
“Signs”                    Neurotic  Psychotic   Neurotic  Psychotic   of Difference

1. 25 responses or less       20        14          80        56         N.S.*
2. 1 M                        13        12          52        48         N.S.
3. FM > M                     11        13          44        52         N.S.
4. Rejection                  11         6          44        24         N.S.
5. F% > 50                    15        11          60        44         N.S.
6. A% > 50                     9        10          36        40         N.S.
7. 1 FC                       17        18          68        72         N.S.
8. CS                         16        16          64        64         N.S.
9. SS                         12        19          48        76         .05

5 or more neurotic signs      18    [illegible]     72    [illegible]   [illegible]

* N.S. = not significant.





The results of the present study are sum- 
marized in four tables. As can be seen in Table 
1, there are no statistically significant differ- 
ences between the neurotic and psychotic 
groups with regard to 8 of the 9 “neurotic
signs.” With respect to the ninth “sign,”
shade shock, the difference is significant at the
.05 level.


Table 2

Comparison of Mean Absolute Frequency Scores of
“Neurotic Signs” between Neurotics and Psychotics

                                  Mean                   SD                       Significance
“Signs”                    Neurotic  Psychotic   Neurotic       Psychotic     t*   of Difference†

1. Number of Responses      21.08     26.00    16.[illegible]  [illegible]  [illegible]  [illegible]
2. Number of M               1.72      3.04       1.48           3.27        1.81      N.S.
3. Number of FM              2.24      3.16       2.78           2.65        1.16      N.S.
4. Number of Rejections       .76       .68       1.07           1.46         .22      N.S.
5. F%                       56.28     48.96      21.55          20.71        1.20      N.S.
6. A%                       46.32     46.12      15.24          17.34         .40      N.S.
7. Number of FC              1.16       .92       1.38           1.09         .67      N.S.
Average Number of Signs      4.96      4.76       1.87           2.27         .33      N.S.

* Degrees of freedom is 48 for all categories.
† N.S. = not significant.





In Table 2 the mean absolute frequency 
scores also show no significant differences be- 
tween the two groups. Color shock and shade 
shock are not included in Table 2 because they 
are not quantified. 


Table 3

Comparison of “Signs” of F+% of Less Than 60
and Fewer than 5 Populars between Neurotics
and Psychotics

                    Total Signs            Percentages        Significance
“Signs”          Neurotic  Psychotic   Neurotic  Psychotic   of Difference

F+% < 60*            3        14         12.5      58.33        .001
< 5 P                7        18         28        72           .002

* N = 24. In both groups there was one case with an
F% of zero.


Table 3 shows that the neurotic group can 
be differentiated from the psychotic on the 
basis of the F+% less than 60, and fewer than 
5 Populars as “signs.” The significance of the 
difference is less than .001 and .002, respectively.
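The significance levels for the differences between percentages can be approximated from the printed counts. The sketch below uses the common pooled two-proportion normal test; whether this is the exact test taken from Johnson [5] is an assumption on our part.

```python
from math import erfc, sqrt


def two_proportion_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p) for the difference between two independent proportions."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    return z, erfc(abs(z) / sqrt(2))


# Counts as printed in Tables 1 and 3: neurotic group first, psychotic second.
z_shade, p_shade = two_proportion_test(12, 25, 19, 25)  # shade shock
z_fplus, p_fplus = two_proportion_test(3, 24, 14, 24)   # F+% below 60
z_pop, p_pop = two_proportion_test(7, 25, 18, 25)       # fewer than 5 Populars

for label, z, p in [("shade shock", z_shade, p_shade),
                    ("F+% < 60", z_fplus, p_fplus),
                    ("< 5 Populars", z_pop, p_pop)]:
    print(f"{label}: z = {z:.2f}, p = {p:.4f}")
```

Under this approximation shade shock falls just under the .05 level, and the two psychotic “signs” fall near the .001 and .002 levels shown in Table 3.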










Table 4

Comparison of F+% and Number of Populars
Between Neurotics and Psychotics

                     Mean                    SD                      Significance
“Signs”       Neurotic  Psychotic    Neurotic  Psychotic     t       of Difference

F+%            79.58     60.04        17.28     25.34      2.96*        .01
No. of P        4.88      3.84         1.63      1.41      2.36†        .03

* 46 degrees of freedom.
† 48 degrees of freedom.


When we compare the average F+% and 
the average number of Populars of the two 
groups, we find the significance of the differ- 
ence between the two groups is less than .01 
and .03, respectively. These results substanti- 
ate those in Table 3. 
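A t value of this kind can be recomputed from the means and standard deviations in Table 4. The sketch assumes that each standard deviation was computed with N as the divisor, so that the variance of each mean is SD² / (N − 1); the text does not state which formula was used. Only the Populars row is shown, since the F+% row rests on 24 cases per group and need not reproduce under the same assumption.

```python
from math import sqrt


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t for two independent means when each SD was computed with N as the divisor."""
    standard_error = sqrt(sd_a**2 / (n_a - 1) + sd_b**2 / (n_b - 1))
    return (mean_a - mean_b) / standard_error


# Number of Populars, Table 4: neurotic group first, psychotic group second.
t_populars = t_from_summary(4.88, 1.63, 25, 3.84, 1.41, 25)
degrees_of_freedom = 25 + 25 - 2
print(f"t = {t_populars:.2f} with {degrees_of_freedom} df")
```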

The results of the present study do not sup- 
port the use of the so-called “neurotic signs” 
to differentiate the neurotic from the psychotic 
patient. The total number of such “signs” does 
not differentiate these two groups. We find 
that five or more of these “neurotic signs” is 
above average for both neurotic and psychotic 
groups. And if we plot the incidence of in- 
dividual “signs” for both groups, we do not 
find a characteristic pattern for each, as sug- 
gested by Harrower-Erickson [3]. For pur- 
poses of screening or differential diagnosis the 
use of this procedure therefore appears to be 
questionable. 

We further find that of the scoring cate-
gories tested only three show a statistically sig- 
nificant difference between the neurotic and 
psychotic groups. The psychotics consistently 
showed greater shade shock, lower F+%, and 
fewer Populars than did the neurotics. The 
results show that the established cutting scores 
of 60 for F+% and 5 for Populars are effective.
These three categories might prove useful
as “signs” for purposes of differential diagnosis.


Summary

Two groups of 25 psychiatric patients each,
one neurotic, the other psychotic, were compared
in their performance on the Rorschach
with respect to certain scoring categories which
are used as “signs.” Only one of the nine categories
commonly referred to as “neurotic
signs” was found to be statistically different for
the two groups. The use of five or more of
these “neurotic signs” to differentiate between
the neurotic and psychotic was not substantiated.
A statistically significant difference between
the two groups was found with two
other scoring categories, F+% and number of
Populars.


Received September 10, 1952. 


References 


1. Beck, S. J. Rorschach’s test. Vol. II. A variety
of personality pictures. New York: Grune & 
Stratton, 1946. 

2. Brosin, H. W., & Fromm, E. O. Rorschach and 
color blindness. Rorschach Res. Exch., 1940, 4, 
39-70. 

3. Harrower-Erickson, Mollie R. The value of the 
so-called “neurotic signs.” Rorschach Res. Exch., 
1942, 6, 109-114. 

4. Harrower-Erickson, Mollie R. Diagnosis of 
psychogenic factors in disease by means of the 
Rorschach method. Psychiat. Quart., 1943, 17, 
57-66. 

5. Johnson, P. O. Statistical methods in research. 
New York: Prentice Hall, 1949. 

6. Klopfer, B., & Kelley, D. M. The Rorschach
technique. Yonkers, N.Y.: World Book Co., 
1942. 

7. Miale, Florence R., & Harrower-Erickson, Mol- 
lie R. Personality structure in the psychoneuro- 
ses. Rorschach Res. Exch., 1940, 4, 71-74. 


Journal of Consulting Psychology
Vol. 17, No. 2, 1953


Group Psychotherapy With Acutely Disturbed 
Psychotic Patients¹,²


Herman Feifel 
and Arnold D. Schwartz³


Winter Veterans Administration Hospital 


Some of the main currents of present think- 
ing with regard to maladjustment in people 
stress the idea that emotional disturbances 
have part of their genesis in interpersonal re- 
lations, and, consequently, that interaction 
among individuals can serve as a basic thera- 
peutic agent. This concept, together with the 
desire to reach more patients than individual 
psychotherapy permits, has resulted in an in- 
creasing use of the group psychotherapy meth- 
od of approach. The method has been ap- 
plied with normals, neurotics, and different 
types of psychotic patients with varying de- 
grees of success. However, a review of the 
literature points up the fragmentary knowl- 
edge and paucity of information existing in 
the field of group psychotherapy with acutely 
disturbed psychotic patients. 

This paper is a report of an exploratory 
study of the effects of an open-end type of 
group therapy on a ward of disturbed psychotic 
patients in a Veterans Administration Hos- 
pital. Since our orientation considers psy- 
chosis to be a form of defense reaction to situa- 
tions of conflict, our efforts were directed 
toward accelerating the development of self- 
esteem and strengthening the ability of the pa- 
tients to cope with factors of social reality. 
Available data on a patient population similar 
to the one with which we worked enabled us 
to make broad statistical comparisons as to the 
relative effectiveness of the group psychothera- 
py procedure. 


¹A portion of this paper was read at the Amer.
Psychol. Ass., Washington, D.C., September, 1952.

²From Winter Veterans Administration Hospital,
Topeka, Kansas.

³Formerly a Fellow of the Menninger School of
Psychiatry, Topeka, Kansas. Now at Langley-Porter
Clinic, San Francisco, Calif.


Method 

Subjects. One of the authors was assigned 
as resident psychiatrist to a locked ward on the 
Acute Section of Winter Veterans Administra- 
tion Hospital for a period of six months. Dur- 
ing the first three months, he used a program 
of milieu therapy, i.e., occupational therapy, 
corrective activities, special service functions 
including gym, attendance at movies and 
dances, etc. Some of the patients, in this 
interim, also received hydrotherapy, electro- 
convulsive therapy, and individual psycho- 
therapy. During the second three months, the 
ward program was identical for the patients, 
with one exception — the addition of group 
psychotherapy sessions. 

There were 21 beds on the ward and these
were occupied by 100 patients during the six-month
period being considered, 51 during the
first three months, and 49 during the second
three months. Since there was an overlapping
of certain patients from the first to second
three-month period, and because we considered
only those patients as part of the experimental
group who had attended a minimum of eight
group psychotherapy sessions, our experimental
group comprised 34 patients, which was, fortuitously,
the number of control patients available
to us from the first group.


Table 1

Distribution of Diagnostic Categories in
Experimental and Control Groups

Category                               Experimental   Control
Schizophrenic, paranoid                     14           14
Schizophrenic reaction, unclassified         8           10
Manic-depressive                             3            1
Involutional melancholia                     1            1
Alcoholic psychosis                          2            2
Organic psychosis                            2            1
Depression, unclassified                     2            1
Psychotic depression                         -            1
Character-disorder                           2            3
  Total                                     34           34


Table 2

Means and Standard Deviations for Intelligence,
Age, and Education of Experimental
and Control Groups

                 Experimental    Control
Measure             group         group
                   (N = 34)      (N = 34)
IQ
  Mean              104.3         102.4
  SD                  5.2           4.2
Age
  Mean               35.4          34.2
  SD                 11.8          12.6
Education
  Mean               10.4          10.1
  SD                  2.6           3.0


Table 3

Marital, Urban-Rural, and Race Status of
Experimental and Control Groups

                 Experimental      Control
Status             N      %        N      %
Single            12     35       14     41
Married           12     35       12     35
Divorced          10     30        8     24
  Total           34    100       34    100
Urban             16     47       19     56
Rural             18     53       15     44
  Total           34    100       34    100
White             33     97       32     94
Nonwhite           1      3        2      6
  Total           34    100       34    100


Tables 1, 2, and 3 present the relevant information
concerning the background of both
groups. Table 1 indicates that both groups,
composed mainly of World War II male
veterans in various stages of remission,⁴ were
well matched in terms of their diagnostic
classification. The paranoid schizophrenic and
schizophrenic reaction, unclassified, types of
category dominate both the experimental and
control groups.

⁴It should be kept in mind that although the majority
of our patients were acute psychotics in the
sense that this was their first or second “break,” a
few were now in the exacerbatory stage of a long
illness, e.g., one of our manic-depressive patients
had been ill sporadically for twenty years.

The findings in Table 2 show that the 
groups are also well matched on the variables 
of intelligence, age, and education. Both are 
slightly above average in intelligence, have an 
approximate mean age of 35 years (the age 
range of the experimental group extended 
from 21 through 61 years of age; that of the 
control group from 22 through 64 years of 
age), and possess an educational level the 
equivalent of approximately two completed 
years of high school. 
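
The statement that the groups are well matched can be checked roughly from the summary figures in Table 2 alone. The sketch below is an illustration and not part of the original analysis. It assumes 34 patients per group and computes, for each measure, a standardized mean difference (Cohen's d with a pooled SD) and an unpooled two-sample t statistic. The names `TABLE_2` and `compare` are hypothetical.

```python
import math

# Summary statistics as reported in Table 2: (mean, SD) for each group.
# Assumption: 34 patients in each group, as stated in the table heading.
N = 34
TABLE_2 = {
    "IQ": ((104.3, 5.2), (102.4, 4.2)),
    "Age": ((35.4, 11.8), (34.2, 12.6)),
    "Education": ((10.4, 2.6), (10.1, 3.0)),
}


def compare(mean_a, sd_a, mean_b, sd_b, n=N):
    """Return (Cohen's d, t statistic) for two groups of equal size n."""
    # Pooled SD for equal group sizes is the root mean of the two variances.
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    d = (mean_a - mean_b) / pooled_sd
    # Two-sample t statistic computed from the standard error of the difference.
    t = (mean_a - mean_b) / math.sqrt(sd_a ** 2 / n + sd_b ** 2 / n)
    return d, t


results = {name: compare(*a, *b) for name, (a, b) in TABLE_2.items()}
for name, (d, t) in results.items():
    print(f"{name:<10} d = {d:.2f}   t = {t:.2f}")
```

Under these assumptions every t statistic falls below roughly 2.0, the approximate two-sided 5 per cent critical value for samples of this size, which is in line with the description of the groups as well matched. The IQ difference is the largest of the three.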

The data in Table 3 illustrate the general 
similarity of both groups with regard to 
marital situation, urban-rural background, and 
racial status. The experimental group has a 
few less single individuals and a few more 
divorced persons, and more people with a 
rural background than does the control group. 
Except for one Negro patient in the experi- 
mental group and two in the control group, 
all the patients were white and American born. 

Analysis of the occupational background of
the groups revealed that, in both, skilled
workers predominated with a sprinkling of
farmers, professionals, and semiskilled persons.

In both groups, approximately 80 per cent
of the patients were admitted from outside the
hospital, the remainder coming to the ward
from other parts of the hospital.

The duration of present hospitalization for
the majority of patients in the experimental
and control groups was usually less than two
months, and frequently as little as two or three
weeks, although a small minority in both
groups had been in longer. As has been
mentioned, a minority in each group received
ECT, hydrotherapy, and individual psychotherapy
in addition to the general milieu therapy
program and, for the experimental group, the
group psychotherapy activities.

Procedure. The meetings lasted one hour
and took place twice weekly in an informal
setting in the ward’s dayroom. In all, 20 sessions
were held with the experimental group.
Chairs were arranged in a circular pattern,
the two therapists sitting next to each other
and the patients being free to sit in any of the
chairs, on card tables, or on the floor, if they
so desired. Attendance at the meetings was 
explicitly made voluntary. The group itself 
was of an open-end type, i.e., while old 
members of the group were leaving for other 
wards, trial visits home, or discharge from the 
hospital, new additions to the ward were al- 
lowed to participate in the meetings as soon as 
they felt so inclined. Because of the turnover 
of patients, our last session had only three of 
the original participating patients. 


The group structure was of a permissive 
nature, i.e., the patients were free to talk about 
anything they desired, sit wherever they 
wished, walk out of the meeting if they felt 
like it, etc. However, since these were acutely 
disturbed individuals, one restrictive aspect 
was introduced. The patients were told that 
no physical assaultiveness or self-destructive 
behavior would be permitted during the group 
sessions. 

In the matter of recording the sessions we 
considered various alternatives: wire or tape 
recording, a secretary using shorthand, or 
taking notes ourselves. We decided on the 
last course since it permitted us to secure 
verbatim notes, as well as get a record of 
feeling-tones and gestures which might have 
been overlooked if only verbal productions had 
been recorded. It also allowed us to note, more 
specifically, the therapists’ reactions to the pa- 
tients as well as their behavior toward the 
therapists. 


Therapists. The two therapists were the
ward doctor and the section psychologist, who
alternated in the roles of group leader and
recorder-observer. The orientation of both was
to create a friendly and noncritical atmosphere
and to foster the patients’ introducing as
much of the material as possible and their leading
the discussions. The therapists did become
active under certain conditions, e.g., to help a
patient enter into the discussion and to aid
him in expressing himself, to promote group
interaction, etc. The approach can best be
categorized as the “analytic-investigative” orientation
described by Standish and Semrad [5].

Rating of improvement. Judgment on the 
improvement or nonimprovement of patients 
was made by the ward doctor, Chief of the 
Acute Section, and Assistant Chief of the 
Acute Section. At the time of rating the pa- 


tients, none of the raters were aware that their 
judgments would be used as a criterion in 
evaluating the effectiveness of group psycho- 
therapy activities. Although their judgments 
were made independently, they were probably 
influenced by previous mutual discussions con- 
cerning the hospital needs of the patients. 
Nevertheless, the very strong agreement with 
respect to the patients’ status in practically all 
of the cases, we feel, tends to make the role 
of contaminatory bias less important than it 
might otherwise seem.


Results

Group Process. The most striking aspect of 
the experimental group’s development was its 
maturation from a gathering of individuals, 
most of whose members talked unintelligibly, 
to a cohesively interacting group animatedly 
discussing common problems and drawing upon 
one another for help. 

The first meetings were characterized by a 
good deal of open hallucinating, unintelligible 
talk, and expressions of autistic thinking, e.g., 
one of the more disturbed patients stated his 
concern over what doctors do to patients as 
he incoherently discussed experiments dealing 
with the mixing of animal and human blood. 
Another patient, who had been homicidally 
assaultive, mentioned, with a smile, that he 
had many friends whose heads were going to 
adorn fence posts and that his mother would 
not escape his wrath either. Testing the new 
situation was obvious, e.g., one of the patients 
stated his feeling that the aides were “no 
damn good, a bunch of incompetents who only 
worked here because they couldn’t do anything
else”; another asked, “If we speak to you,
won’t we stay here longer?” Participation of
most of the members during these early ses- 
sions was minimal and talk was directed to the 
therapist rather than to other group members. 
Also, a good deal of physical activity was 
noticeable; some of the patients walked around
the room, others made frequent trips to the 
water fountain, and some walked out of the 
dayroom altogether. 

As the sessions progressed, a few of the 
patients began griping over personal incidents 
to the therapist, still in a kind of private con- 
versation setup, “When am I going to get my 
laundry back?” “Why can’t I go to the typing 
class with the open ward?” etc. Little by little,
they became interested in how other group
members were reacting to what they said, e.g., 
one of the patients, telling the therapist that 
he felt responsible for his father’s death, turned 
to the group asking for their judgment about 
what had happened. Gradually, more and 
more group members started voicing their 
feelings. Soon the participants were discussing 
common personal problems, e.g., “Will my 
mind be affected because of this sickness I’ve 
had?” “Should I tell my employer that I’ve 
been in a mental hospital?” as well as common
ward problems, e.g., “What should we do 
with those people who spit into the water 
fountain?” “How about not having any piano 
playing when some of us are trying to sleep?” 
etc. Exchange of ideas and feelings became 
frequent and often heated. No longer was 
the group composed of “passive spectators,” 
but it had become one in which a definite 
“we” feeling predominated and group members 
had developed feelings of responsibility toward 
each other, e.g., one of the patients who had a 
large farm offered another member of the 
group a job on it if he couldn’t find something 
in his own line after he got out of the hospital. 

A definite by-product of the meetings was 
the sizeable number of patients who ap- 
proached both therapists after the sessions re- 
questing individual interviews concerning ma- 
terial they had previously repressed or couldn’t 
talk about. 

The general developmental process of our 
group appears to be similar in many respects to 
Frank’s [2] findings with chronic schizo- 
phrenic patients and to those of Luchins [4] 
with open ward patients in the NP section of 
a military hospital. 

As to the themes dominating the discussions,
the early sessions dealt with various expressions
of guilt feelings over attitudes and
actions, fancied and real, toward close family 
members. The middle sessions brought to the 
fore conflicts associated with sexual adjust- 
ment, particularly their own sex role and 
identity, and the problem of relating them- 
selves to women, as well as concern over in- 
ability to control aggressive impulses. The 
last sessions focused on the factors of personal 
responsibility for their illness and getting 
well, the stigma of mental illness, and the 
problems of adjusting outside the hospital. 


Some of the major facets that the group 
psychotherapy experience seemed to provide 
and accelerate for most of the group members 
were: 

Catharsis — allowing the expression of 
anger, resentment, and various negative feel- 
ings against the hospital and aides, and finally 
against the ward doctor and therapists. For 
example, one of the more disturbed patients 
assaulted one of the therapists during an early 
session. The question of what prompted the 
patient to do this and the problem of punish- 
ment were brought up immediately and dis- 
cussed. The group was informed that the pa- 
tient would be sent to a more secure locked 
ward so as to protect others from the patient, 
and the patient from his own fears. The group 
took three to four meetings to work through 
their own anxieties before the matter was 
dropped. The therapists noted that most of 
the patients identified themselves both with 
the therapist who had been assaulted, as they 
individually expressed how they would have 
handled the situation if the patient had struck 
one of them, and with the assaultive patient, 
as they speculated about what should happen 
to, and what was happening to the patient who 
could no longer control his aggressive impulses. 
It was also surprising to note how much of the 
group interaction was based on “rubbing” and 
“sniping” at each other. 

Communication — increasing the patients’ 
ability to communicate with others and getting 
them to the point where they could formulate 
and define their problems. An instance of this 
is where one of the patients talking incoher- 
ently was soon interrupted by another who 
said, “How can the doctor understand you 
when I can’t?” 

Leadership — despite the moving in and out 
of members and the general flexibility of the 
group, there was always a core of stimulating 
participants who “carried the ball.” They took 
the initiative, many times, in responding to the 
queries directed to the therapists and in or- 
ganizing the discussions. It was evident, too, 
that the assumption of group leadership and 
participation was independent of the age of the 
patient and based more on individual person- 
ality factors. 

Insight — many of the group members 
came to realize that the conflicts they had were
of their own making and not just due to “bad
luck,” “members of their family,” or “society’s 
fault.” They saw that they would have to ad- 
just themselves to other people and not the 
whole world to them, and that neither medi- 
cine, the hospital, nor the therapists could help 
them unless they tried to help themselves. 
This is aptly illustrated by a conclusion the 
group came to at the end of a discussion con- 
cerning ways of handling the international 
situation, “How can we talk about changing
others when we can’t even change ourselves?” 


Control — strengthening of ego factors was 
definitely apparent in the increased ability of 
the group members to direct and respond to 
strong verbal criticism without being over- 
whelmed, in their growing ability to identify 
with other members, and in improved reality 
orientation. 

It should be remembered that the group 
process described above was by no means a 
linear development but rather indicates the 
general direction of the group. Also, the find- 
ings were characteristic of the group as a 
whole and do not imply that each group 
member maintained this general progress. As 
a matter of fact, some members of the group 
became even more disturbed during the period 
of group psychotherapy. 

Quantitative comparison between experimental
and control groups. In terms of broad
statistical comparison, Table 4 shows the hospital
status, after three months on the ward,
of the experimental and control groups. It is
evident that more of the patients who had the
benefit of group psychotherapy improved, and
fewer became more disturbed, than in the group
which had not received group psychotherapy. In
the experimental group, 24 patients showed
improvement as against 17 patients in the control
group, and only 4 patients in the experimental
group became more disturbed as against 7
patients in the control group. No improvement
was indicated in 6 patients in the experimental
group and in 10 patients in the control group.
A chi square was calculated combining the
more disturbed and no improvement patients
into one “no improvement” category. Yates’s
correction for continuity was applied (1 df).
Table 5 indicates the χ² to be 2.22 with a p
value of 15 per cent. While this does not reach
conventional levels of statistical significance,
the trend is in favor of the experimental
group. In addition, analysis of the
ward records of both groups indicated that less
medicinal and physical sedation, i.e., barbiturates,
hydrotherapy, packs, etc., was necessary
for the group psychotherapy patients. A sampling
follow-up of members from both groups,
six months after the present study, revealed
an improved-disturbed ratio in the experimental
and control groups similar to the one indicated
in Table 4.
Unfortunately, we did not have as a control
a group that had received no therapy at all so
that the effects of natural remission could be
evaluated relative to the effectiveness of our
milieu and milieu plus group psychotherapy
groups.


Table 4

Hospital Status after Three Months on the Ward
of Experimental and Control Groups

                    Experimental      Control
Status                N      %        N      %
More Disturbed*       4     11        7     21
No Improvement        6     18       10     29
Improved†            24     71       17     50
  Total              34    100       34    100

*More Disturbed = transferred to a closed ward with
greater security precautions.

†Improved = discharged from the hospital, left the
hospital on trial visit, or transferred to an open ward.


Table 5

Chi-Square Analysis of Table 4 Combining More
Disturbed and No Improvement Categories

                   Experimental      Control
Status               fo    fe        fo    fe       df    χ²     p
No Improvement       10  (13.5)      17  (13.5)
                                                     1   2.22   .15
Improvement          24  (20.5)      17  (20.5)
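
The chi-square computation described above can be retraced from the counts in Table 4. The following is a minimal sketch, not the authors' own working: it merges the "More Disturbed" and "No Improvement" rows, derives expected frequencies from the marginal totals, applies Yates's correction, and obtains the 1 df tail probability from the standard-library function `math.erfc`. The names `observed` and `yates_chi_square` are hypothetical.

```python
import math

# Observed counts from Table 4, with "More Disturbed" and "No Improvement"
# merged into a single "not_improved" category as the text describes.
observed = {
    "experimental": {"not_improved": 4 + 6, "improved": 24},
    "control": {"not_improved": 7 + 10, "improved": 17},
}


def yates_chi_square(table):
    """Chi square for a 2 x 2 table with Yates's continuity correction."""
    groups = list(table)
    outcomes = list(table[groups[0]])
    row_totals = {g: sum(table[g].values()) for g in groups}
    col_totals = {o: sum(table[g][o] for g in groups) for o in outcomes}
    grand_total = sum(row_totals.values())
    chi2 = 0.0
    for g in groups:
        for o in outcomes:
            expected = row_totals[g] * col_totals[o] / grand_total
            # Yates: shrink each |observed - expected| by 0.5, not below zero.
            deviation = max(0.0, abs(table[g][o] - expected) - 0.5)
            chi2 += deviation ** 2 / expected
    return chi2


chi2 = yates_chi_square(observed)
# For 1 df, the upper-tail probability of chi square equals erfc(sqrt(chi2 / 2)).
p = math.erfc(math.sqrt(chi2 / 2))
print(f"chi square = {chi2:.2f}, p = {p:.3f}")
```

Run as written, this gives a chi square of about 2.21 and a p of about .14, close to the 2.22 and 15 per cent reported in Table 5. The small discrepancy may reflect rounding or the use of printed probability tables.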

Qualitative differences between experimental 
and control groups. Above and beyond the 
quantitative findings, the authors were strongly
impressed by the qualitative differences between
the groups.

Relationships between patients in the ex- 
perimental group were more friendly and easy. 
The content of “gripe sessions,” held once 
weekly, changed from individual complaints 
to questions affecting the group. Instead of 
the usual unproductive griping, these sessions 
were utilized by the patients to make their 
ward situation more tolerable and pleasant. 
Where, at first, the ward doctor led these 
meetings, one month after the group psycho- 
therapy meetings had been started, the group 
elected their own spokesman to conduct the 
meetings. They appointed a secretary who 
took notes and informed the group of sug- 
gestions brought up during the week, action 
taken on previous suggestions, new members 
admitted to the ward, etc. They chose an 
athletic chairman who took the lead in inter- 
ward volleyball and intraward bowling, etc. 
In effect, the ward doctor became relegated 
to a position of advisor to the group. The 
patients were more than ordinarily reactive to 
how others viewed them, e.g., one of them 
found it very difficult to justify his claim that 
he was in the hospital for a “sprained 
sacroiliac” when other patients mentioned that
it didn’t seem to bother him when he was 
playing volleyball or bowling. Another pa- 
tient found it increasingly hard to maintain a 
supercilious air when several patients brought 
up the point that it seemed strange to them 
that with all his education, he was on a locked 
ward just as the rest of them. Further, it was 
seen that although a patient could feel free to 
“act crazy” for the doctors, nurse, or aide, he 
was less inclined to do so before his fellow 
patients, with the result that psychotic behavior 
and garbled language decreased. On one oc- 
casion the ward patients took to task one of 
the patients who was always putting up a 
“phony front” before ward personnel. As a 
result of the group pressure, the patient’s tall 
tales and protests disappeared after a few days. 
It was apparent that, in many instances, the 
opinions and attitudes of the group were more 
effective instruments in changing behavior of 
patients on the ward than were the activities 
of the ward doctor and ward personnel. 

From a previous attitude of unconcern 
toward the ward, the patients in the experi- 
mental group developed a definite interest in 
their ward surroundings. Problems of ward
temperature, ventilation, cleanliness, etc., became
matters of import. During a meeting of
the group, one of the patients brought up the 
point that the windows and curtains were 
filthy and hadn’t been cleaned in the past six 
months, and that it probably would take an- 
other six months to get them cleaned through 
the usual channels of red tape. The subject 
was temporarily dropped. However, a few 
days later, the patients took the initiative in 
asking the nurse and aides if they could clean 
the windows and curtains. The result was a 
joint effort in which both ward personnel and 
patients participated. 

The ward doctor found his own morale im- 
proving and his interest in the individual pa- 
tient increasing when the patients began to 
show some change after group therapy had 
been in progress for a short while. As delu- 
sions, hallucinations, and psychotic talk de- 
creased, the doctor felt more at ease on his 
ward rounds and in interviews, both in dis- 
cussing the patient’s conflicts and in attempt- 
ing to formulate for himself the dynamic as- 
pects of the problems they presented. The 
patients themselves also evidenced greater free- 
dom in bringing up meaningful material. In 
the general permissive atmosphere generated by 
the group psychotherapy sessions, both patients 
and doctor were better able to discuss the anxi- 
ety attending the patient’s ambivalence con- 
cerning his leaving or remaining in the hos- 
pital, as well as his concrete plans for posthos- 
pital adjustment. 

The meetings also had an effect on the re- 
lationship between ward personnel and the pa- 
tients. In one of the “gripe sessions,” about 
two months after the inception of group thera- 
py, the patients asked why they weren’t 
allowed out-of-doors in the yard when the 
weather was good. An aide replied that he 
could not watch all of the patients when he 
was the only one on ward duty and that several 
patients might try to elope. A patient sug- 
gested that the group itself take responsibility 
for informing the aide of any attempted elope- 
ment. A few patients stated that they didn’t 
want such responsibility. Further discussion 
ensued and as a result of a majority vote, the 
entire group decided to take responsibility. 
Future events showed that they held up their 
end of the bargain, informing the aide on
several occasions when one or two of the more
disturbed patients attempted to get “lost” from 
the group when going to or returning from 
a group activity. The patients were able to 
form realistic relationships with ward person- 
nel on the basis of the kind of individuals they 
were rather than on the basis of what they 
thought the ward personnel represented. For 
example, they complained about one specific 
aide being lazy because he did not help them 
with their food trays as he was supposed to do, 
an opinion which was shared by most of the 
staff. 

The sessions also had their effect on relations 
between the ward doctor and ward personnel. 
Before the group therapy sessions with the pa- 
tients were begun, ward personnel brought up 
few problems of their own at their weekly 
meetings with the doctor. After the group psy- 
chotherapy meetings were begun, a noticeable 
change took place in their interest. Nurses 
and aides began to ask more questions about 
the patients and to bring up various problems 
they had in dealing with them. Sharpening of 
interest was indicated by a developing concern 
over the proper attitudes they should adopt 
toward various patients, particularly with re- 
gard to handling their anxieties by means other 
than the usual ones of sedation. 

In summary then, perhaps more important 
than the quantitative improvement in the 
group psychotherapy patients, in this instance, 
was the level and quality of their improvement
and its concomitant healthy effects on the func- 
tioning of the ward doctor and ward personnel. 


Discussion 


A characteristic aspect of the group psychotherapy
sessions was the manner in which they
elicited material which had not come out pre- 
viously. We think this was due to the group 
structure setting which stimulated expression 
of problems and conflicts to a group of peers 
also having similar emotional difficulties. Be- 
cause of this aspect, group psychotherapy can 
often serve as a springboard and valuable ad- 
junct for individual psychotherapy in its grap- 
pling with the more deep-seated emotional 
problems of the patients. 

Another characteristic finding was the 
“proving ground,” so to speak, that the sessions 
offered to most of the patients before they left
the ward or hospital, for testing and convincing
themselves that they would be able to
handle themselves in a less “protected” en- 
vironment. 

We were definitely impressed at the move- 
ment of the group as a whole, despite its open- 
end structure with the constant moving out of 
old members and the incoming of new ones. 
It is quite conceivable that positive results 
might have been intensified had the group been 
a closed one; the point remains, nevertheless, 
that forward changes did occur despite 
changes in the membership of the group. 

Another revealing result relating to the 
homogeneity of groups was our experience that 
the age differences, per se, of our patients were 
of little consequence. If anything, the age dif- 
ferences, at times, tended to bring perspective 
to many of the issues discussed. We do not 
imply that the age factor cannot be a signifi- 
cant one, but rather that in a group of veter- 
ans who have similar emotional problems and 
past experiences, and interests in common, it 
does not appear to be as important as in other 
patients. 

The sessions emphasized that far from being 
seclusive and withdrawn, necessarily, our dis- 
turbed patients showed marked sensitivity and 
reactiveness to other group members. Marked 
changes in personality were not too obvious in 
most of the group, but what was patent was 
the gradual disappearance of asocial tendencies, 
increased reality awareness, and a reorienta- 
tion of attitudes, e.g., “We've got to change, 
not the outside world.” Part of the patient’s 
pathology, as Breckir [1] has correctly 
stressed, disappears when he becomes an in- 
teracting group member. 

As to technique, we both had to guard
against getting too involved with content ma- 
terial. Much better results were secured when 
we focused on the feelings being expressed by 
the patients. In attempting to catalyze the 
discussions, we tried, as much as possible, to 
make use of the patients’ productions and 
level of discussion as long as we felt comfort- 
able in doing so. We did not confront the pa- 
tients with past behavior or case-record ma- 
terial except where they themselves brought 
them up in discussion or by symptomatic be- 
havior during the meetings. 

We thought that our sitting next to one
another enabled the patients to equate us and
helped them see that each of us assumed the 
same role at alternate meetings. By leaving 
the seating arrangements up to the individual 
patient, a shy patient could progress at his own 
speed from the periphery of the group (or from 
the floor behind a chair) to the center of the 
group next to the leader. Further, a patient 
who was angry could express his feeling by 
walking out of the meeting, or by moving to 
the periphery — a move which lent itself to 
discussion. 

We felt that having two therapists for the 
group was a decided advantage. At the be- 
ginning, particularly, it provided supports that 
enabled us to weather unpredictable outbursts
of aggression. The second therapist functioned 
as an “observer” as well as “recorder” in the 
sessions he did not conduct. We found this 
viewpoint to be very helpful during discus- 
sions between ourselves dealing with the ses- 
sions in pointing out transference and counter- 
transference phenomena, and guarding against 
blind alleys. One thing that was specifically 
noticeable in the early sessions was the con- 
ceptualization of the therapists by the patients 
as parent surrogates with consequent attempts 
to play one off against the other, utilizing the 
wedge that one was the ward doctor and the 
other a “visiting doctor.” When the patients
realized that this strategy would not work, it 
disappeared. We also discovered, as Hayward, 
Peters, and Taylor [3] have pointed out, that 
the presence of more than one therapist 
allowed more targets to the patients for ex- 
pressing ambivalent feelings, and permitted a 
greater therapeutic force to be exerted. 

One of the major questions that comes to 
mind as a result of this experience is the role 
played by the factors of natural remission and 
the sheer “attention” aspect of group psycho- 
therapy as distinct from the procedure itself. 
Our study obviously does not permit any defi- 
nite answers. It is quite conceivable that had 
our experimental group received the benefit, 
let us say, of a picnic once a week instead of 
the group psychotherapy sessions, the results 
would have been the same or, perhaps, even 
better. Nevertheless, we think that we can 
legitimately state that “attention” of the group 
psychotherapy type appears potentially helpful
therapeutically.

We found it difficult to evaluate the effect 
of concurrent therapies that specific patients 
were receiving; also to assess properly the re- 
lation between time spent in the sessions and 
the therapeutic results obtained since the pa- 
tients in our group ranged in participation 
from 8 through 20 sessions. 

It is quite apparent that our study poses 
more questions than it answers. Research in- 
volving more controlled studies of process, ob- 
jective evaluation of therapeutic effectiveness, 
as well as extended time-series measurements 
is a necessity before we can be certain of the 
validity of any of the observations already 
made. Some of the main areas that demand 
future attention are the comparative efficacy 
of the open-end and closed type of groups, op- 
tional versus set topics, the relative effective- 
ness of group psychotherapy with different 
types of patient populations, dynamics of the 
group psychotherapy process and its relation 
to what occurs in individual psychotherapy, 
multi-therapists in group psychotherapy, and,
as has been mentioned previously, the “attention”
aspect of group psychotherapy and the
problem of natural remission. One mode of 
approach, not employed by us, that could be 
helpful in shedding light on some of these 
questions would be the use before and after 
treatment of various measures of personality 
assessment, like the Rorschach, Q technique, 
etc. 

Summary 

1. Twenty group psychotherapy sessions of
an open-ended structure were held with acute- 
ly disturbed closed-ward patients in an NP 
veterans hospital. They were diagnosed mainly
as paranoid schizophrenics and schizophrenic 
reaction, unclassified. Two therapists, one of 
whom was the ward doctor, alternated in the 
roles of group leader and recorder-observer. 

2. Despite the continued entrance of new 
patients, the group, as a whole, moved from 
autistic thinking and individual preoccupations 
to concern with common problems and more 
realistic social relationships. 

3. The main themes dominating the discussions
were: (a) guilt feelings toward close relatives,
(b) sex adjustment problems, (c) control
over aggressive feelings, (d) personal responsibility
for illness, (e) the stigma of
mental illness, and (f) adjustment outside the
hospital.

4. In group interaction, the age factor, per 
se, was less important seemingly than the 
similarity of problems and commonness of 
background in the patients. 

5. The sessions were helpful in eliciting 
material not easily secured otherwise. 

6. The patients appeared to give up part of 
their pathology in becoming interacting mem- 
bers of the group. 

7. The presence of more than one therapist 
probably facilitated the therapeutic process. 

8. The method of group psychotherapy is 
adaptable to treating acutely disturbed schizo- 
phrenics. 

9. A broad statistical comparison of the 
group psychotherapy patients was made with 
another group of patients, similar in back- 
ground and previously on the same ward with 
the same physician and an identical treatment 
program except for the group psychotherapy. 
The group psychotherapy patients showed
more improvement. However, a more impressive
finding was the quality of the improvement
and its reflection in general ward activities
and relations with ward personnel.

10. Areas for further research were indi- 
cated as well as the necessity for more accu- 
rate and controlled studies to substantiate the 
validity of the present observations. 


Received August 27, 1952. 


References 


1. Breckir, N. J. Hospital orientation and training 
program for group psychotherapy of schizo- 
phrenic patients. Psychiat. Quart., 1950, 24, 130- 
143. 

2. Frank, J. D. Group psychotherapy with chronic 
hospitalized schizophrenics. In E. B. Brody & 
F. C. Redlich (Eds.), Psychotherapy with 
schizophrenics. New York: International Univer.
Press, Inc., 1952. Pp. 216-230. 

3. Hayward, M. L., Peters, J. J., & Taylor, J. E. 
Some values of the use of multiple therapists in 
the treatment of the psychoses. Psychiat. Quart., 
1952, 26, 244-249. 

4. Luchins, A. S. Group structures in group psy- 
chotherapy. J. clin. Psychol., 1947, 3, 269-273. 

5. Standish, C. T., & Semrad, E. V. Group psychotherapy
with psychotics. J. psychiat. soc.
Wk, 1951, 20, 143-150.











Journal of Consulting Psychology
Vol. 17, No. 2, 1953


The Treatment of Delayed Speech by Client- 
Centered Therapy 


Henry J. Dupont 
George Peabody College 


Theodore Landsman 
Vanderbilt University 


and Milton Valentine 
Stanford University 


One of the speech problems dealt with by 
speech therapists is called “baby talk,” “re- 
tarded speech,” or “delayed speech.” Berry 
and Eisenson [2] and Van Riper [7] believe
that when an emotional disturbance coincides 
with the development of speech, there will be 
speech retardation. This is apparently due to 
the fact that speech is an overlaid function, 
with all of the anatomical parts used in speech 
having a more basic biological function. These 
parts, the tongue, palate, lungs, etc., are free
for speech only when the organism is operating 
with a certain minimum of dynamic integra- 
tion. 


If the very young child is neglected or re- 
jected and reacts with highly motivated dis- 
organized behavior, it is easy to see how the 
development of speech could be influenced ad- 
versely. The correction of the emotional diffi- 
culty might therefore be considered an essential 
first step in delayed speech therapy. 


Delayed speech offers a unique challenge to 
the psychotherapist who emphasizes the growth 
aspect of therapy as the client-centered thera- 
pist does. If the client-centered therapy, as ad- 
vocated by Rogers [4, 5, 6], does result in re- 
leasing the potential for growth and improved 
interpersonal relations, then client-centered 
therapy should prepare the child for speech 
growth also. 

¹ A preliminary version of this paper was presented
at the annual convention of the Southern
Society of Philosophy and Psychology, Section on
Experimental Clinical Psychology, Knoxville, Tennessee,
April 11, 1952.


Recently Backus and Beasley, commenting 
on eight years of experimenting with programs 
of therapy, have stated that “the changes in 
speech behavior which took place depended less 
upon devices for breathing, blowing, tongue 
exercises, ‘ear training’ and the like, and more 
upon forces operating in the interpersonal
relationship between child and therapist. . . .” [1,
p. 4]. It is even possible, as Werner [8] reports,
that as the child’s emotional and social
adjustment improves in therapy, his speech will 
improve without any treatment of the speech 
problem as such. This is a plausible prediction 
because therapy usually results in an improve- 
ment in social and interpersonal relations and 
speech is a social skill very essential to social 
and interpersonal relations. The social milieu 
of almost every child will contain adequate 
pressure and stimulation for correct speech, 
which the child will be able to respond to as 
his adjustment improves. 


It is our purpose in this paper to present 
and discuss the case history, diagnosis, and 
treatment of one case of delayed speech. The 
appropriateness of client-centered therapy as 
treatment will be discussed and evaluated. 


Case History and Diagnosis 


Johnny² was referred to one of the writers
(a speech therapist) by a welfare agency for
speech correction. His social worker reported
that Johnny was having considerable difficulty

² In order to protect the child’s identity, a fictitious
name is used.


both in school and in his foster home. In fact, 
he was a problem to his social agency, for his 
speech made it impossible to find a satisfactory 
permanent foster home for him. 


When first seen by the therapist, Johnny
appeared small for an eight-year-old; he was thin,
had delicate features, and a very forward and
outgoing manner. His attempts to speak resulted
mostly in vowel sounds, frequently
garbled. His hearing as measured by the 
Maico Clinical Audiometer was normal and he 
missed only three words out of thirty on the 
children’s version of the Anderson Speech 
Sound Discrimination Test. A phonetic an- 
alysis was made of Johnny’s speech for use in 
studying progress in his speech correction and 
for diagnosis. Because of his speech, an ade- 
quate measure of his IQ was not obtained, but 
his school performance and his general behav- 
ior did not suggest mental deficiency. 


A report submitted by the social worker pro- 
vided the following information: Johnny was 
the sixth of seven children. When he was three 
and while his mother was carrying her seventh 
child, his father was arrested for larceny and 
sent to jail for eight years. Shortly after this 
his mother became completely disorganized and 
unable to care for the children. It was not un- 
til one year later, however, that Johnny and 
his older brother, Tom, were taken from his 
mother and placed in a temporary foster home. 
One year and four months later Johnny and 
his brother were placed in another home sup- 
posedly for permanent placement. The new 
foster parents were unable to accept Johnny 
because he didn’t “talk right,” and he and his 
brother were returned to the temporary home. 
Tom was soon placed in another home, but 
Johnny remained in the temporary home 
where he was living at the time of his referral. 
From the above, and from his behavior in the 
interviews, it was felt that Johnny suffered 
from feelings of rejection to which he reacted 
with negative and aggressive behavior. A diag- 
nosis of “delayed speech” was made and it was 
decided that treatment should consist of client-, 
or in this case, child-centered play therapy. It 
was also predicted at this time that removal of 
the emotional problem would result in a 
spontaneous improvement in speech. 




Description of Treatment 


Two therapists took part in Johnny’s ther- 
apy which consisted of forty-one play therapy 
interviews extending slightly over one year. 
The play sessions took place in a standard play 
therapy room. 

Because it was virtually impossible to under- 
stand Johnny, emphasis was placed on helping 
him perceive that he was accepted by the thera- 
pist. This is an aspect of therapy given em- 
phasis by Landsman [3]. An effort was made 
to show him that he was still liked and accepted
even if his frequent limit-breaking could
not be.


A brief description of the pattern of the first 
several sessions provides a contrast with the 
last two sessions: In the first several sessions 
Johnny’s behavior followed a consistent pat- 
tern of aggression and destruction. In each 
session, after a brief period of examining the 
toys, he would begin to laugh and yell with 
apparent glee as he systematically put the room 
into shambles. 

The last sessions were spent in talking to 
the therapist about the things he was doing as 
he played constructively with blocks, pounded 
nails in a two-by-four, and painted. In the 
last session, he built a sailboat while discussing 
the fact that he wasn’t coming to therapy any- 
more. At one point, he reminded the therapist 
that he hadn’t brought more paint as he had 
promised he would. When the therapist sug- 
gested that maybe Johnny was a little angry at 
him for not keeping his promise he replied, 
“No, that’s all right.” At the end of the ses- 
sion he shook hands and said good-bye with no 
signs of difficulty. 

This nonquantitative observation of the im- 
provement in therapy is supplemented by the 
more objective phonetic analysis made of 
Johnny’s speech by the speech therapist. The 
before-analysis was made in one of the early 
diagnostic interviews. The other was made 
from a recording of the last play session. It 
must be emphasized that the phonetic analysis 
was made by a speech therapist who had no 
part in the treatment. 


The Speech Therapist’s Analysis of Speech 


The before-analysis was as follows: his 
speech is garbled and unintelligible; he seldom 










attempts to speak. At its clearest, his speech 
shows the following omissions and substitu- 
tions in all positions: omissions: s, z, sh, zh, 
tch, dzh; substitutions: p for l, w for r, f for
th, v for th; k and t are used indiscriminately
as are g and d. Most often his speech consists 
solely of vowel sounds. 

The after-analysis was as follows: he initiates
speech freely and frequently, he is almost
always completely intelligible and only the
following errors still occur: s and z are substituted
for sh, zh, tch, and dzh. P is substituted
for l in the medial position, and w for r in all
positions.


Table 1

Substitutions and Omissions Usually Made in
Speech Before and After Therapy

Speech errors      Before    After
Omissions             6         0
Substitutions         6*       10†

*Counting indiscriminate use of k and t, g and d.
†Counting all possible combinations of substitutions.
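The counts in Table 1 can be retraced from the two phonetic analyses. The sketch below is only an illustration of the counting convention described in the table footnotes; the variable and function names are our own and do not come from the speech therapist's report.

```python
from itertools import product

# Error inventories transcribed from the before- and after-analyses.
before_omissions = ["s", "z", "sh", "zh", "tch", "dzh"]
before_substitutions = [
    ("p", "l"), ("w", "r"), ("f", "th"), ("v", "th"),
    ("k", "t"),  # k and t used indiscriminately, counted once
    ("g", "d"),  # g and d used indiscriminately, counted once
]

after_omissions = []
# s and z are each substituted for sh, zh, tch, and dzh;
# every possible combination is counted, as in the table footnote.
after_substitutions = list(product(["s", "z"], ["sh", "zh", "tch", "dzh"]))
after_substitutions += [("p", "l"), ("w", "r")]


def tally(omissions, substitutions):
    """Return the number of omission and substitution errors."""
    return {"omissions": len(omissions), "substitutions": len(substitutions)}


before = tally(before_omissions, before_substitutions)
after = tally(after_omissions, after_substitutions)
print("before:", before)
print("after:", after)
```

Under this convention the rise in substitutions from 6 to 10 reflects sounds that were formerly omitted and are now attempted, which is the movement from omission to substitution discussed in the text.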


An example of the way Johnny moved from 
omission-to-substitution-to-correct-sound would 
be his speech development on the word show. 
At first he said, “ow it to me”; then, “sow it 
to me”; and will soon (we predict) be say- 
ing “show it to me.” 

The social worker reported that both his 
teacher and his foster mother had noted and 
commented on his improved behavior and 
speech. Coincidental with the last several ses- 
sions of therapy, Johnny was being prepared 
for permanent placement in another foster 
home. This step was now believed practical 
because of his considerable improvement. 


Discussion 


As Van Riper [7] points out, in the develop- 
ment of speech the first sounds mastered are 
the vowels, then the labials, then the dentals 
and gutturals, then the complicated lip and 
tongue sounds, then the blends. When Johnny 
was first seen, his speech had stopped develop- 
ing at the vowel and labial level. This is ap- 
proximately the three-year-old level indicating 
that Johnny’s speech had developed little in the 
last five years. 







As shown in Table 1, when last seen he had 
only the sh, zh, tch, dzh blends, and l and r to
master. The disappearance of omission and the 
increase in substitution show, as the speech 
therapist reported, that he was more motivat- 
ed to speak as therapy progressed. This change 
in motivation to speak is considered a basic 
factor in the development of Johnny’s speech. 
As is generally agreed, speech is learned and it 
is therefore important that it be attempted and 
practiced. 

Adequate control is a very great problem in 
studies of this type. Certainly we do not even 
approach control of the many variables in the 
present study. If we were certain that Johnny’s 
speech was due to emotional maladjustment, 
and if we could be confident that no other 
factors except the therapy relationship were 
operating to help Johnny change, then we 
might conclude with some confidence that ther- 
apy without speech re-education is adequate 
treatment for delayed speech. All that can be 
said, however, is that it certainly appeared as 
though Johnny’s speech difficulty was due to 
his negativism and aggression and that as far 
as we knew there were no other changes in 
Johnny’s life except his relationship with the 
therapist. If Johnny’s speech remained ex- 
tremely retarded for five years only to improve 
very much during the one year he was visiting 
a play therapist, we can at least be suspicious 
of the therapy as a possible contributor to the 
improvement. We believe that Johnny’s speech 
improved automatically as he became better in- 
tegrated as an organism and more efficient in 
interpersonal and social interactions. Whatever
the contributing factors, it is an observable
fact that Johnny’s speech improved without
any speech instruction.

It is dangerous to generalize from one case, 
but the results of this study tend to re-enforce 
the hypothesis that child-centered therapy with- 
out speech instruction is adequate treatment 
for some cases of delayed speech. Many addi- 
tional studies, more thorough than this one, 
will have to be completed before enough data 
will be available to prove or refute the present 
hypothesis. 

The results of the study seem to be favor- 
able toward the claims of client-centered ther- 
apists regarding the growth-releasing aspects 
of client-centered therapy, but these claims are 




by no means proved. The results of the study 
seem to indicate that therapy can take place 
even when language communication is difficult 
and verbalized feelings are at a minimum. In- 
dicating to the child that he was accepted, even 
when not understood, was apparently adequate 
to carry the therapy. This is a point made by 
Landsman [3] and Rogers [4], both of whom
stress the importance of accepting the client’s
attitudes and feelings even though limits may 
have to be set on his behavior, especially in the 
case of children. 


Summary 


In the foregoing report we have attempted 
to describe and evaluate child-centered therapy 
as a treatment method for delayed speech 
where an emotional disturbance is considered 
a causative factor. From the description and 
analysis of one case the following conclusions 
were indicated:

1. The therapist’s observation of the child 
indicates improvement in emotional adjust- 
ment and intelligibility of speech. 

2. A comparison, by phonetic analysis, of 
the speech before and after therapy indicates 
that improvement occurred without speech in- 
struction. 

3. The results of the study are in general 
consistent with the point of view held by 




Backus and Beasley [1] except that we would 
emphasize the importance of being child-centered
rather than speech-centered.

4. The results tend to re-enforce the hy- 
pothesis that child-centered therapy can be ade- 
quate treatment for some types of delayed 
speech. The potential value of additional re- 
search on child-centered therapy as treatment 
for delayed speech instead of speech-centered 
therapy is demonstrated by this study. 


Received September 20, 1952.


References 


1. Backus, Ollie, & Beasley, Jane. Speech therapy 
with children. Boston: Houghton Mifflin, 1951. 

2. Berry, Mildred F., & Eisenson, J. The defective 
in speech. New York: Appleton-Century, 1942. 

3. Landsman, T. Counseling for emotional prob- 
lems. Delaware Sch. J., 1951, 16, 6-7. 

4. Rogers, C. R. Significant aspects of client-cen- 
tered therapy. Amer. Psychologist, 1946, 1, 415- 
422. 

5. Rogers, C. R. Divergent trends in methods of
improving adjustment. Harv. educ. Rev., 1948,
18, 209-219.

6. Rogers, C. R. Client-centered therapy. Boston:
Houghton Mifflin, 1951.

7. Van Riper, C. Speech correction. (2nd Ed.)
New York: Prentice Hall, 1939.

8. Werner, L. S. Treatment of a child with delayed
speech. J. Speech Disorders, 1945, 10,
329-334.








Journal of Consulting Psychology 
Vol. 17, No. 2, 1953





Response to the Human Face as a Standard Stimulus 


Ernst G. Beier, Carroll E. Izard, Charles D. Smock, 
and Roland R. Tougas


Syracuse University 


In his investigation of the underlying as- 
sumptions of the TAT and related projec- 
tive techniques, Lindzey observed, “Although 
little effort has been made to explore the vari- 
ations in fantasy productions between many of 
the important groups of our own society, it is 
quite widely accepted or expected that these 
differences exist. Even such an important 
cleavage as that between male and female has 
been little explored so far as TAT behavior 
is concerned, while such variables as socioeco- 
nomic status, occupational role, and ethnic 
group-membership have also been of slight 
interest to most investigators” [7, p. 17]. His 
thoughtful analysis emphasizes the importance 
of research in this area. Investigations at- 
tempting to differentiate among the reactions 
of various age categories and sexual groups 
to a standard stimulus may indeed serve to 
help us to obtain more refined data. 

Numerous studies, in the past, have been 
devoted to problems of differential reactions 
utilizing the human face as a standard stim- 
ulus. For example, the importance of the face 
as the mediator of expression, of interpersonal 
responsiveness, and of overt contact with the 
world at large is generally recognized. Fur- 
ther, at the moment of emergence of social 
consciousness in the infant, it is likely that the 
face of the mother (or nurse) serves as 
stimulus. Recent evidence [4] lays great stress 
on the cathexis or “imprinting” value of such 
awareness at the emergent moment, which may
well underlie the erotic values attached to 
certain faces or the repulsiveness of others. 
Buhler [1] relates that the second month of 
life features social responsiveness as evidenced 
by an infant’s exclusive reaction to the human 
face. “The child of two to four months only 
smiles when it sees the face of a person, but 
not when its attention is directed to colours, 




shining objects, or a live cat” [1, p. 159]. 
Machover [8] in establishing the rationale 
for the interpretation of drawings of a person, 
sees the face (or head) as the most representative
part of a person. “The face may be regarded
as the social feature of the drawing,
. . . and is the most expressive part of
the body” [8, p. 40]. Further, “The head of
the adult is the most important organ relating 
to the emotional security of the child” [8, 
p. 39]. 

Recognition of the importance of the human 
face as stimulus in social functioning, attitud- 
inal reactions, and early sensitization general- 
ly, led several investigators into different di- 
rections in search of its intrinsic values. 
Landis [6] and Frois-Wittman [3] have 
used the human face in the study of emotional 
qualities, the social psychologists [10] have 
used it as a stimulus to elicit prejudices, and 
Szondi [2] bases his projective technique on 
value judgments about photographs of human 
faces, to mention only a few. Through the 
use of this medium, important results were 
derived but additional basic knowledge would 
seem desirable. 

The present paper is a report on a study 
designed to investigate two basic problems 
related to the human face as a stimulus. The 
first problem is one of cultural preference, and 
the second that of differential reaction of the 
two sexes to the stimuli. For purposes of the 
present study, the term cultural preference 
intimates the general meaning a stimulus has 
to large numbers of individuals living within 
the boundaries of a particular social unit. An 
illustrative application of the term cultural 
preference may be found when thinking of the 
facial structure of a movie star; one could 
predict that a majority of people in our 
culture would register a “like” response for 




such a picture. Again, it is known that a large 
number of persons in our culture prefer the 
photograph of the only young face in the 
Szondi pictures. Neither of these facts gives 
us enough information to formalize our 
knowledge about the individual who has responded
in such a way. In the present study,
an attempt was made to throw more light on 
the problem of cultural preferences. Specif- 
ically, which cultural preferences are observ- 
able when we break down a large sample of 
photographs into various age and sex groups? 
What is, by and large, the feeling toward 
semblances of older people, those of younger 
people, toward males and toward females? 


Group reactions to as highly structured a 
stimulus as the human face conceivably conceal 
large and important differential responses be- 
tween male and female subjects. In the pre- 
sent study, this possibility was investigated by 
breaking down the total sample into subparts 
of male and female subjects. Accordingly, dif- 
ferential sex responses to varying age and sex 
groups could be determined directly. 


It might be argued that in such complex 
stimuli as photographs of human faces, the 
individual face would be more important than 
the specific group to which it belonged and 
that generalizations could not be made from 
such findings. It is here contended, however, 
that the present selection of pictures as well 
as the large number of pictures used are like- 
ly to counteract these difficulties. 


Procedure 


In order to obtain answers to the questions 
regarding (a) the nature of cultural preferences,
and (b) the quality of differential responses
between the sexes, a large number of
photographs of various sex and age groups
was obtained and presented to male and 
female subjects for preference ratings. For 
purposes of analysis of the data, it would then 
be possible to break down the stimulus photo- 
graphs into subgroups according to age and 
sex and divide the subjects into male and 
female groups. 

Subjects. The sample* consisted of 45 males
and 31 females enrolled in two sections of a

*The writers wish to express their appreciation
to Dr. Roland McKee for his cooperation in making
his classes available for this study.




general psychology course at Syracuse Uni- 
versity. The age range of the subjects was 18 
to 28 with a mean age of 20.41 years. 

Selection of photographs. From a pool of 
over 1000 pictures of human faces, gathered 
from a variety of sources such as journals and 
yearbooks, a sample of 250 was selected. Four 
clinical psychologists served as judges. Selec- 
tions were made on the basis of the following 
criteria: (a) full-face view; (b) nonemotional
expressions (no laughter, anger, or fear);
(c) equal numbers of males and females; (d) 
age groups to cover normal life span; and (e) 
pictures rated unequivocally by the four 
judges were excluded, to obtain maximum 
differentiation. 

The full-face view was chosen in order to 
equate the photographs according to structure. 
The lack of a specific emotional expression 
was chosen as a selection criterion because it 
was felt that certain emotional qualities in a 
face might introduce unaccountable variables. 
The prejudgment of photographs was under- 
taken to eliminate such pictures as movie stars 
and beauty queens. It was felt that this type 
of picture would be chosen unanimously and 
thus preclude differentiation. The final pic- 
tures chosen reflected certain categories, divid- 
ing the photographs into younger, peer, and 
older age groups. 

Table 1 presents a distribution of the se- 
lected photographs of human faces according 
to age and sex. 


Table 1
Distribution, According to Age and Sex, of Sample
of Photographs of Human Faces

                      Age Range
Sex         1-3    4-16    17-30    40+    Total N
Male                29      75      19      123
Female              28      72      22      122
Infants      5                                5
Totals       5      57     147      41      250





Administration. The photographs were at- 
tached to a strip of tape in random order and 
then projected on a screen by a standard 
opaque projector. The size of the picture and 
the fixation point were kept constant. The 
screen and seating arrangement were so designed
as to present equal opportunity for
viewing and to minimize and equalize possible
distortions. Subjects were presented each










picture for five seconds (timed), during which 
interval judgments were to be made and their 
responses recorded on the answer sheet, as per 
the following instructions: 


We have here a series of 250 portraits of people’s 
faces. Obviously, when we are confronted with a 
stranger, looking at him and at his face particu- 
larly, we get certain first impressions about the per- 
son. In this research, we are seeking for your first 
impression. We will present you with the portraits 
of 250 individuals and some of them will impress 
you as real likeable people whose company you 
would appreciate. Some will impress you as people 
you wouldn’t care to meet or associate with at all. 
Remember, we want your first impression and that 
we recognize that your first impression would be 
modified if you knew the person. In between the 
more extreme likeable and not liked faces, you will 
find many faces that will impress you as either just 
o.k. or mildly disliked, both pretty indifferent really. 
On the answer sheet, we want you to mark A for 
the very likeable; B for the mildly likeable; C for 
mildly disliked; and D for the definitely disliked. 
You will have 5 seconds to judge each picture. Each 
picture must be judged. In other words, each picture
should be rated either A, B, C, or D. Remember:
A: a very positive yes; B: just yes; C: just
no; D: very definitely no. Remember: First impression
only. The time is short: only 5 seconds in
which to glance at the picture and then to record
your answer. Are there any questions as to what is
to be done?


All responses were recorded on IBM answer 
sheets and machine scored. 

In analysis of the data, A and B categories
were combined as “likeable,” and C and D 
categories were combined as “dislikable.” A 
further breakdown of categories would yield 







an analysis of the intensity of “likes” and “dis- 
likes” but is not considered part of the pres- 
ent study. 


Table 2

Distribution of Preference Ratings by Males and
Females on Test and Retest

              Male (N = 45)        Female (N = 31)
Measure      Like    Dislike      Like    Dislike
Test         6595     4655        4775     2975
Retest       6731     4519        4778     2972








A retest yielded a frequency distribution of 
preference ratings as shown in Table 2. An 
unpublished dissertation [5] has indicated the 
essential stability of preferential ratings for 
pictures of human faces as stimuli. 
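As a rough illustration of how the frequencies in Table 2 can be read, the following sketch checks that each row accounts for every rating (number of subjects times 250 pictures) and expresses the "like" responses as a proportion. The names used here are ours, and the proportion is only a descriptive summary, not a reliability coefficient reported by the writers.

```python
PICTURES = 250  # each subject rated all 250 photographs

# (like, dislike) frequencies transcribed from Table 2
table2 = {
    ("male", "test"): (6595, 4655),
    ("male", "retest"): (6731, 4519),
    ("female", "test"): (4775, 2975),
    ("female", "retest"): (4778, 2972),
}
n_subjects = {"male": 45, "female": 31}


def like_proportion(like, dislike):
    """Proportion of all ratings that fell in the combined A and B categories."""
    return like / (like + dislike)


summary = {}
for (sex, occasion), (like, dislike) in table2.items():
    expected_total = n_subjects[sex] * PICTURES
    summary[(sex, occasion)] = {
        # True when like + dislike equals subjects x pictures
        "complete": like + dislike == expected_total,
        "like_proportion": round(like_proportion(like, dislike), 3),
    }

for key, value in summary.items():
    print(key, value)
```

On these figures the proportion of "like" ratings shifts by about one percentage point for the male subjects and by less than a tenth of a point for the female subjects between test and retest.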


Results 


The demonstration of cultural preferences, 
as defined in this study, may be observed from 
an analysis of the data in Table 3. The data 
presented are for the total sample of male and 
female subjects combined, and are representa- 
tive only of the 17-30 age group. 


For the group as a whole, inspection of 
Table 3 reveals that only two age-sex groups 
fail to be differentiated from each other. These 
are (a) peer males (17-30) from older males 
(40+), and (b) older males (40+) from older
females (40+). In effect, the subjects divided
the categories into two groups: one older
than themselves and the other younger than
themselves. This finding emphasizes that age 

















Table 3
Chi-Square Values Comparing Ratings of Sex and Age Groups With Each Other (df = 1)

                        Young     Young     Peer      Peer      Older     Older
                        Male      Female    Male      Female    Male      Female
                        (4-16)    (4-16)    (17-30)   (17-30)   (40+)     (40+)
Infants (1-3)           31.77*    16.76*    186.29*   126.72*   157.58*   199.75*
Young male (4-16)                 10.53*    338.99*   171.39*   207.38*   290.56*
Young female (4-16)                         465.77*   269.51*   294.56*   392.68*
Peer male (17-30)                                     53.09*    .08       7.84*
Peer female (17-30)                                             24.62*    60.13*
Older male (40+)                                                          3.77

*Significant at the 1% level of confidence.
Cut-off points for chi square: 6.64 = .01 level; 10.83 = .001 level.







is an important factor in the measurement of 
preferences relative to the human face. 

The peer-age female category, interestingly 
enough, was placed with the younger group, 
and the peer-age male category with the older 
group. Thus, when considering the peer-age 
category, age and sex both seem important de- 
terminants in preferential ratings, whereas in 
all categories combined, age alone appeared 
the more important factor. 
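For readers unfamiliar with the statistic, the sketch below shows one common way a chi square with one degree of freedom is obtained when the "like" and "dislike" frequencies of two picture categories are compared, and how the result is read against the cut-off points given under Table 3. The report does not state exactly how the contingency tables were formed, so the Pearson formula without continuity correction, the function names, and the counts used here are all assumptions made for illustration only.

```python
CUTOFF_01 = 6.64    # chi square needed at the .01 level, df = 1
CUTOFF_001 = 10.83  # chi square needed at the .001 level, df = 1


def chi_square_2x2(like_a, dislike_a, like_b, dislike_b):
    """Pearson chi square for a 2 x 2 table (no continuity correction)."""
    n = like_a + dislike_a + like_b + dislike_b
    numerator = n * (like_a * dislike_b - dislike_a * like_b) ** 2
    denominator = (
        (like_a + dislike_a)
        * (like_b + dislike_b)
        * (like_a + like_b)
        * (dislike_a + dislike_b)
    )
    return numerator / denominator


def significance(chi_square):
    """Classify a chi square value against the two cut-off points."""
    if chi_square >= CUTOFF_001:
        return "significant at the .001 level"
    if chi_square >= CUTOFF_01:
        return "significant at the .01 level"
    return "not significant at the .01 level"


# Hypothetical frequencies, not taken from the study:
# category A is liked 60 times in 100, category B 40 times in 100.
value = chi_square_2x2(60, 40, 40, 60)
print(round(value, 2), significance(value))
```

With these hypothetical frequencies the two categories would be called differentiated at the .01 level but not at the .001 level.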


Table 4

Cultural Preferences for Age and Sex Groups
for All Subjects

                                 Ratings of All Subjects
Group                     N      Like     Dislike      χ²        p
Infants (1-3)             5       334        46      221.62    .001*
Young male (4-16)        29      1645       559      549.67    .001*
Young female (4-16)      28      1677       451      707.27    .001*
Peer male (17-30)        75      2956      2744        8.13    .01*
Peer female (17-30)      72      3213      2259      166.56    .001*
Older male (40+)         19       743       801        5.78    .02*
Older female (40+)       22       802       870       15.51    .001*

*Significant at the level of confidence established for
this study.


Up to this point, evidence has been present- 
ed to indicate that some age-sex groups are 
singled out by the subjects for preferential re- 
sponses. In Table 4, it may be seen that the 
subjects responded to all age-sex groups in a 
significantly different manner. The reader will 
note, however, that in two of the seven groups 
(older male and older female), the “dislike” 
responses are larger than the “like” responses, 
while in the other five categories the “like” 
responses are larger. These data could be in- 
terpreted as indicative of a cultural preference 
rating. The subjects professed “liking” for 




photographs of human faces of individuals of 
peer age and younger than themselves, while 
they professed a “disliking” for faces of indi- 
viduals in the category older than themselves. 
Additionally, large differences were also noted 
within the five “like” categories. This sug- 
gested the possibility of ranking the relevant 
age-sex categories in order of the observed 
preference. This study, however, was not designed
for such close differentiation, and precise
ranking could not be established. The approximate
ranking given below might be suggestive
of further hypotheses. The ranking
was obtained by the following formula: 


(Like responses) - (Dislike responses)
--------------------------------------  X 100
 Number of pictures in each category

In this manner, a “like” per picture score was
obtained, as shown in Table 5. The ranking
would indicate the approximate position of
each category in terms of cultural preference.
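
As a rough illustration, the printed formula can be written as a small function. This is only a sketch: the function name and the `scale` parameter are assumptions added here, with `scale` defaulting to the factor of 100 shown in the formula. The sketch does not attempt to reproduce the entries of Table 5.

```python
def preference_score(like, dislike, n_pictures, scale=100):
    """Net "like" responses per picture, times a scale factor.

    Follows the printed formula: (like - dislike) / n_pictures * scale.
    The name and the adjustable `scale` argument are illustrative assumptions.
    """
    if n_pictures <= 0:
        raise ValueError("n_pictures must be positive")
    return (like - dislike) / n_pictures * scale


# Infant category, using the Like/Dislike counts and N from Table 4.
# With scale=1 the result is the net number of "like" responses per picture.
infants_per_picture = preference_score(334, 46, 5, scale=1)
print(infants_per_picture)
```

Leaving the scale factor as an argument makes it easy to compare the net responses per picture with the same quantity multiplied by 100.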


Table 5
Preference Scores for Age Groups

Age group                  Preference score
Infants (1-3)              57.6
Young female (4-16)        21.5
Young male (4-16)          19.0
Peer female (17-30)        6.5
Peer male (17-30)          1.5
Older female (40+)         -3.10
Older male (40+)           -3.05





The contrast in “liking” expressed for chil- 
dren as opposed to the “disliking” directed 
towards adults is indeed striking. Further in- 
vestigation of this phenomenon would seem in- 
dicated. 

Inspection of Table 6 tells us that in four 
of the seven categories differential responses 
were found. In their response to infants, to 
young males, to older males, and to older fe- 
males, the male subjects respond with a sig- 
nificantly different Like:Dislike ratio than 
do the female subjects. In their responses to 
peer-aged men and women, however, as well 
as to younger girls, the male and female sub- 
jects do not differ from each other. For all 
categories combined, men react differently 
from women in a statistically significant man- 
ner. These findings lend support to our hy- 










Table 6
Variation of Preferences for Age and Sex Categories
of Male and Female Subjects

                                 Males               Females
                                 Like     Dislike    Like     Dislike    χ²       p
Infants (1-3) (N: 5)             189      36         145      10         7.86     .01*
Young male (4-16) (N: 29)        930      375        715      184        19.22    .001*
Young female (4-16) (N: 28)      982      278        695      173        1.40     .25
Peer male (17-30) (N: 75)        1741     1634       1215     1110       .25      .60
Peer female (17-30) (N: 72)      1894     1346       1319     913        .22      .60
Older male (40+) (N: 19)         420      435        323      366        4.56     .05*
Older female (40+) (N: 22)       439      551        363      319        12.77    .001*
Total N:                                                                 17.35    .001*

*Significant at the level of confidence established for
this study.


pothesis that caution should be exercised in 
attempting to establish common norms for 
both men and women in reference to human 
faces and possibly to all standard projective 
stimuli. 
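
The comparison of male and female subjects within one category can be sketched as a chi-square test on a 2 x 2 table of Like and Dislike counts. The article does not give its computing formula, so the sketch below assumes an uncorrected Pearson chi-square with one degree of freedom; the function name is illustrative. Under that assumption, the infant and peer-male rows of Table 6 come out close to the printed values of 7.86 and .25.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (df = 1, no continuity correction) for the table

        [[a, b],
         [c, d]]

    e.g. a, b = male Like, Dislike and c, d = female Like, Dislike.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Counts taken from Table 6 (males: Like, Dislike; females: Like, Dislike).
infants = chi_square_2x2(189, 36, 145, 10)
peer_male = chi_square_2x2(1741, 1634, 1215, 1110)
print(round(infants, 2), round(peer_male, 2))
```

The shortcut form n(ad - bc)^2 / (row and column totals) is algebraically the same as summing (observed - expected)^2 / expected over the four cells.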


Discussion 


These preliminary findings are reported here 
because they seem to be of sufficient impor- 
tance to throw doubt on some of the basic 
assumptions underlying some present-day uses 
of the medium under consideration. Definite 
cultural preferences were shown to be operat- 
ing in the qualitative responses to age and sex 
groups through the utilization of human faces 
as stimuli. The subjects showed more “liking” 
for faces of peer-age females and younger peo- 
ple, and less “liking” for faces of older peo- 
ple and peer-age males. It becomes clear that 
more such information will have to be made 
available in order to fully understand an in- 
dividual’s responses. 







These findings seem to mirror a cultural
pattern: the older-age categories are the only
ones that received a majority of “dislike”
ratings. This reflects to some extent the role
of the older individual in our society and the
sympathy he might expect from a younger
group. It is, however, altogether
likely that other cultures would demonstrate 
somewhat different patterns of preferences. 
Further, it is interesting to note that the male 
subjects responded differently from the fe- 
male subjects to almost all age-sex categories 
with the exception of that representing their 
own age. Is this an artifact, or have we here 
an implication of some specific attributes in 
the development of the cultural-sexual role of 
an individual? 

It was also discovered that responses to hu- 
man faces by and large differentiate men and 
women, which indicates that separate norms 
for the sexes should be considered in order 
to understand the individual’s responses more 
thoroughly. What remains hidden within nor- 
mative data which ostensibly differentiate one 
sex from the other can only be hinted at here. 
It would seem possible, however, that the so- 
cial and sexual roles one learns to adopt, the 
sources and direction of identifications, and 
the elements of one’s self concept may be re- 
flected in qualitative responses to human faces. 
Machover’s monograph [8] alludes to these 
potentials from an interpretive point of view. 
That certain faces give expression to one’s 
canalized aspirations or expectancies and thus 
cue off subjective wishes or identifications 
would seem pertinent to such age-sex differ- 
entiations as were found in this study. This 
would seem to embody the transferring from 
previous experience a host of stereotypes and 
long-standing conditioned responses to such 
features as the expression, age, and sex of a 
face. For example, young females have only to 
be pretty and decorative to command social 
attention. Again, the need to please others is 
frequently observed in them. Boys, on the 
other hand, are expected to make rapid strides 
in development of physical and sexual power. 
The influence of such cultural trends on be- 
havior and personality development needs 
clarification. An unpublished dissertation by 
one of the authors is attempting to open for 
consideration the meaning and associational 




content behind ratings of like and dislike for 
human faces as a further step in this direc- 
tion. 


Summary and Conclusions 


An experiment was designed to investigate 
the presence of cultural preferences and sex 
differences in the response of young adults to 
the human face as a standard stimulus. Sub- 
jects for the study were 76 Syracuse Univer- 
sity students, 45 male and 31 female. They 
ranged in age from 18 to 28, the mean age 
being 20.41 years. 

In order to investigate the presence of cul- 
tural preferences to human faces, a series of 
250 pictures covering infancy to old age in 
both sexes was shown. The responses of the 
total sample were statistically evaluated 
through the chi-square technique. For deter- 
mination of sex differences, the responses of 
males and females to each age-sex group were 
statistically analyzed by the same technique. 

The study revealed the presence of clear 
cultural preferences for faces of people of peer 
age and younger. These were rated signifi- 
cantly more “liked” than “disliked.” 

The utility of normative data was shown to be
questionable through the demonstration, within
the limits of the particular sample employed,
of differential responses by males and
females of the same age to pictures of different
age-sex groups.




The limitations that these results impose on 
normative data in response to standard stim- 
uli and possibly all projective stimuli were 
briefly discussed. 


Received August 4, 1952. 


References 


1. Buhler, K. The mental development of the
child. New York: Harcourt, Brace, 1930.

2. Deri, Susan. Introduction to the Szondi Test;
theory and practice. New York: Grune & Stratton,
1949.

3. Frois-Wittman, J. The judgment of facial
expression. J. exp. Psychol., 1930, 13, 113-151.

4. Hutchinson, G. D. Marginalia. Amer. Scientist,
1952, 40, 146-153.

5. Izard, C. E. Perceptual responses of paranoid
schizophrenics and normal subjects to photographs
of human faces. Unpublished doctor’s
dissertation, Syracuse Univer., 1952.

6. Landis, C. Studies of emotional reactions, II.
General behavior and facial expression. J.
comp. Psychol., 1924, 4, 447-507.

7. Lindzey, G. Thematic Apperception Test:
Interpretive assumptions and related empirical
evidence. Psychol. Bull., 1952, 49, 1-25.

8. Machover, Karen. Personality projection in the
drawing of the human figure. Springfield, Ill.:
Charles C Thomas, 1950.

9. Murray, H. A. The effect of fear upon estimates
of the maliciousness of other personalities.
J. soc. Psychol., 1933, 4, 310-329.

10. Sherif, M. An experimental study of stereotypes.
J. abnorm. soc. Psychol., 1935, 29, 371-375.





Journal of Consulting Psychology 
Vol. 17, No. 2, 1953 





Communication and Rapport in
Clinical Testing¹


David Cole 


Occidental College 


Developments during the past few years 
in the fields of counseling and psychotherapy 
have added immensely to our understanding 
of the nature of communication in the thera- 
peutic situation, and of factors which en- 
courage or inhibit communication between 
patient and therapist. We have come to regard 
good communication as prerequisite to good 
psychotherapy. 


The purpose of this paper is to suggest 
that the insights which we have gained here 
also have implications in the procedures em- 
ployed in diagnostic psychological testing. In 
this area we find that traditionally the empha- 
sis as far as the interpersonal relationships 
are concerned has been upon rapport rather 
than upon communication as such. In eval- 
uating the validity of results of such testing 
we have always been concerned with whether 
good rapport existed between subject and ex- 
aminer or whether the performance of the 
subject seemed to be blocked by the nature of 
the immediate interpersonal relationship. We 
have assumed that good rapport was necessary 
for valid testing. 


Most psychometrists will probably accept 
the hypothesis that a close relationship exists 
during testing between communication and 
rapport between subject and examiner. As 
the level of communication improves, rapport 
will improve, and conversely, those factors 
which engender rapport will improve com- 
munication. 

We are concerned with communication in 
all types of clinical testing. In the use of pro- 


¹Paper presented at the West. Psychol. Ass.,
Fresno, Calif., April 25, 1952.




jective techniques it is of great importance 
that the subject feel free to convey to us the 
responses which occur to him without 
conscious screening or censoring, and on the 
other hand that we get a clear picture of 
what the particular response is that the sub- 
ject is attempting to describe. Probably many 
have had the experience in administering the 
Rorschach of being confronted with a re- 
sponse wherein it was very difficult to as- 
certain from our subject just how he was 
using the blot material. We were left with 
the unpleasant speculation as to whether the 
apparent fault in perception was his or our 
own. Again, in intelligence testing it is of
utmost importance, if we are genuinely to get
the qualitatively best performance from our
subject, that a high level of communication
exist. In general, persons doing intelligence
testing for all age levels have followed
along the lines of rapport building suggested 
by Terman and Merrill in the manual for the 
Stanford-Binet: that “nothing contributes 
more to satisfactory rapport than keeping the 
child encouraged,” [1], and further that “in 
general it is wise to praise frequently and gen- 
erously, but if this is done in too lavish and 
stilted a fashion it is likely to defeat its pur- 
pose.” Taking these statements as a starting 
point, and despite the qualifications injected 
by Terman and Merrill, persons doing intelligence
testing have been prone to use praise as
their chief medium of verbal exchange with 
their subjects, praising profusely and some- 
times with too little relation to the qualitative 
performance of the subject. When a subject 
failed an item, they followed Terman’s sug- 
gestion and made “some excuse for it.” 







It is particularly to this approach to rapport 
building in intelligence testing that this paper 
is addressed. Our research in psychotherapy 
has suggested a few things about the value of 
reassurance in facilitating communication. We 
have found that in many cases, reassurance, 
or an evaluative response on the part of the 
therapist, actually serves as a barrier to further 
communication. The fact that the response 
has been evaluated may indicate that all re- 
sponses will be so evaluated, and the client is 
thrown on the defensive, setting up a process 
of screening his statements more carefully 
before presenting them to his critic. This same 
process in all probability works during the 
administration of an intelligence test. Here, 
with the purpose more often openly evalua- 
tive, we consistently find that many subjects 
enter the testing environment on the defen- 
sive. Only the ament or the emotionally in- 
fantile is so lacking in autocriticism that he 
is unable to judge qualitative differences in 
his responses, and is unable to discriminate 
success from failure in some test items. Thus, 
hasty reassurance may often go in conflict 
with the recognized quality of the responses 
as seen by the subject, and increase the inci- 
dence of blocking and refusal to respond. 

The same processes which reduce blocking 
in psychotherapy are appropriate and desirable 
in diagnostic testing. If instead of giving 
fluent and often unwarranted reassurance con- 
cerning our subject’s performance, we make 
a sincere attempt to understand what the 
situation means to him, and attempt to com- 
municate to him this awareness, we may 
raise the level of communication, and hence 
get a more valid picture of what kind of a 
person we are working with. Our aim in 
testing as far as the establishment of rapport 
is concerned should be to capture the frame 
of reference of our subject and to communi- 
cate to him our appreciation of his situation. 

This approach may influence our admin- 
istration of standard intelligence tests in a 
number of ways. With some subjects, the 
initial reaction to the test will be to be appre- 
hensive, cautious, and defensive. They will 
indicate this to you through their remarks and 
behavior as they prepare for the test. Our 
reaction may recognize this apprehensiveness, 
letting a subject know that we appreciate 


how he feels, and can understand why he 
might feel that way. His comments may be 
intrapunitive, indicating a doubt in his own 
ability, “I always do lousy on these things,” 
or may be extrapunitive, “Are these tests 
really what they are cracked up to be?” He 
may communicate through behavior more than 
through words, by sitting rigidly in his chair, 
arms tightly folded, or by conspicuously and 
suspiciously eyeing the stop watch. Reacting to 
such feelings as were just cited with comments 
such as “you are afraid you never do your- 
self justice on a test,” “you think test scores 
aren't the whole story,” “you seem a little 
tense,” or “you think the timing may bother 
you,” can do much to clear the air and reduce 
the hierarchical structure which too often 
exists during testing. 

In proceeding through the test, the subject 
usually first encounters easy items. Various 
reactions may result. One will find this situa- 
tion relieving; he is glad that he is able to 
handle the initial material. This reaction may 
be recognized; we may indicate that we 
understand that he is happy to find easy 
going. Another subject will react to easy items 
with hostility. He is insulted that we should 
feel such material was worthy of his abilities. 
This too can be recognized and accepted as 
a reasonable reaction from the person con- 
cerned. As the testing proceeds most subjects 
will become aware that the items are more 
difficult, and that they are finding it more of 
a problem to produce correct responses. They 
will react to this in accordance with their own 
personality and the nature of existing rapport, 
blaming you, blaming the test, blaming them- 
selves, or accepting the fact that the test is 
now beginning to separate the men from the 
boys. In the interest of good rapport and 
communication it is good to accept these atti- 
tudes as expressed, to let the person know that 
you realize that the items are difficult, that 
he is uncomfortable (if this is his feeling), 
and that he will be glad to get out of hot 
water. 

Because most intelligence testing makes an 
attempt to reach the point where the subject 
can no longer answer correctly, the wide use 
of praise can easily lead to a situation where 
the examiner is confronted with the alter- 
natives of either abandoning his previous ap- 







proach, with the obvious implication that 
praise is no longer deserved, or continuing 
to praise in situations where the subject him- 
self is aware that performance is inadequate. 
To attempt to draw a fine line between prais- 
ing the subject’s effort, and praising the re- 
sults of his effort, is probably often to exceed 
the discriminative capacities of both subject 
and examiner. 


The hypotheses and suggestions here pre- 
sented have proved useful in teaching rapport- 
building techniques to graduate students in 
testing courses during the past two years. This 
approach has been more productive of good
rapport than have other methods.
As empirical evidence of the nature of the 
rapport, students using these techniques have 
been conspicuously more successful in obtain- 







ing subjects who are willing to be retested 
on the alternate forms of the Binet and 
Wechsler. Inherent in this approach is respect 
for the person being tested and a recognition 
of the fact that his feelings are genuine and 
enter into the testing situation. If what has 
been found valid in one area of interpersonal 
relations has validity in another area, then 
such an attitude may be expected to facilitate
communication and rapport, and thus aid in
obtaining a more valid picture of the personality
under investigation.


Received August 19, 1952. 


Reference 


1. Terman, L. M., & Merrill, Maud A. Measuring 
intelligence. Boston: Houghton Mifflin, 1937. 





Journal of Consulting Psychology
Vol. 17, No. 2, 1953


A Validation of Changes in Scores on the Index of
Adjustment and Values as Measures of
Changes in Emotionality¹


Robert E. Bills 


University of Kentucky 


In a study of the validity of the Index of 
Adjustment and Values [1], Roberts [2] 
compared the emotionality of the Index traits 
which were given high ratings by subjects with 
traits which were given low ratings. Roberts 
used the Index traits in free association and his 
criterion of emotionality was response time. 
He concluded that the ratings the subjects 
gave themselves on the Index were valid in- 
dices of the emotionality of the traits for the 
subjects. 


The present study was designed to verify 
Roberts’ conclusions and to investigate changes 
in emotionality when changes in ratings occur 
from test to retest. 


Design 

Fifty volunteers, students at the University 
of Kentucky, were administered the Index of 
Adjustment and Values at the beginning of a 
semester. During the following week these 
students were given, individually, the 49 traits 
of the Index in free association. The subjects 
were instructed to respond as rapidly as possi- 
ble with the first word which “came to mind,” 
and reaction times were measured by means of 
a chronoscope and voice key. The order of 
presentation of the stimulus words was ran- 
domized and a different order was used for 
each subject and each of the two presentations. 


These data were used to verify Roberts’ con- 
clusions. 


Fourteen weeks after the subjects had been 


¹The writer is indebted to the University of
Kentucky Research Fund Committee. A grant-in-aid
from this fund has made the study possible. The
writer would also like to thank Glen E. Roberts for
his invaluable aid as assistant on this project.


tested with the Index they were retested with 
this instrument, and during the following 
week they were retested with the free associa- 
tion test. 

Results 


Data from the first test. The data from the 
first Index and the first free association test 
were used to verify Roberts’ conclusions. 


The Index of Adjustment and Values re- 
quires that a subject make three ratings on a 
five-point scale for each of 49 traits. These 
ratings are arranged in three columns which 
have been designated by the Index authors as 
concept of self, acceptance of self, and the con- 
cept of the ideal self. A fourth score, called 
discrepancy, is obtained by totaling the differ- 
ences between concept of self and concept of 
the ideal self. The three analyses performed 
in this section related to the ratings on three of 
these scales. 


In the first analysis average reaction times 
were determined for each subject on the traits 
he had rated 4 or 5 and 1 or 2 on the concept 
of self. In this manner two distributions of 
50 averages were obtained. Each distribution 
contained an average for each subject. In all 
calculations, words on which blocks occurred 
or on which a subject had a reaction time in 
excess of 6.81 seconds (3 times the standard 
deviation of the distribution of mean reaction 
times of the subjects on each trait) were ex- 
cluded from the analysis. A test of the signifi- 
cance of the difference between the means of 
the above two distributions gave a t of 1.111.
This may be interpreted, at the .30 level, to 
mean that there is insufficient reason to assume 
that the two distributions are not random 
samples from the same population. 
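
The steps of this first analysis can be sketched in code. The sketch makes several assumptions that are not stated in the article: blocked responses are represented as `None`, the two sets of per-subject averages are compared with a paired t test, and all function names and the sample numbers are hypothetical. Only the 6.81-second cut-off comes from the text.

```python
from math import sqrt
from statistics import mean, stdev

CUTOFF_SEC = 6.81  # reaction times in excess of this are excluded (from the text)


def subject_average(reaction_times):
    """Average reaction time for one subject over one set of traits.

    Blocks (represented here as None) and times above the cut-off are dropped.
    Returns None when no usable reaction time remains.
    """
    kept = [rt for rt in reaction_times if rt is not None and rt <= CUTOFF_SEC]
    return mean(kept) if kept else None


def paired_t(low_averages, high_averages):
    """t ratio for the mean per-subject difference (high minus low).

    Assumes both lists hold one average per subject, in the same subject order.
    """
    diffs = [high - low for low, high in zip(low_averages, high_averages)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical data for illustration only.
print(subject_average([2.0, None, 7.5, 3.0]))
print(paired_t([2.0, 2.5, 3.0], [2.5, 2.5, 3.5]))
```

Each subject contributes one average to the low-rating distribution and one to the high-rating distribution, which is why a paired comparison is a natural, though assumed, choice here.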




The data were next analyzed for differences 
in acceptance of self. Average reaction times 
were calculated for all of the traits rated 1 or 
2 (rejection of self) and 4 or 5 (acceptance of 
self). Again, this yielded two distributions of 
50 averages, and each distribution contained 
an average for each subject. A comparison of 
the mean difference gave a t of 1.88 which is 
significant at the .06 level of confidence. 

The third analysis was for discrepancy 
scores; the differences between the concept of 
self and the concept of the ideal self. Average 
reaction times were determined for each sub- 
ject on all traits which showed no discrepancy 
and these averages were compared with the 
average reaction times of traits which showed 
a discrepancy. This comparison gave a t of
1.80 which is significant at the .08 level. Table
1 summarizes the data from the three analyses.


Table 1
Comparison of Free-Association Reaction Times for
High and Low Ratings on Concept of Self,
Acceptance of Self, and Discrepancy

                         Mean reaction    S.E. of
Rating                   time             difference    t       p
Concept of self
  1, 2                   2.60
                                          .09           1.11    .30
  4, 5                   2.70
Acceptance of self
  1, 2                   2.80
                                          .08           1.88    .06
  4, 5                   2.65
Discrepancy
  0                      2.62
                                          .05           1.80    .08
  1, 2, 3, 4             2.71





Test-retest data. Test-retest data were an- 
alyzed to determine if significant differences in 
reaction times occurred when trait ratings 
were changed from test to retest. Three groups 
of analyses were performed on the ratings of 
concept of self, acceptance of self, and discrep- 
ancy. 

Four distributions of average reaction times 
were calculated for each subject on the basis 
of test-retest ratings on concept of self. Aver- 







age reaction times were determined for the 
first and second tests on traits which were 
rated higher on the retest than on the first test 
and the same averages were obtained for traits 
which were rated lower on the retest than on 
the first test. Thus, four distributions of 
average reaction times were obtained, and each 
distribution contained an average for each sub- 
ject. Similar distributions of averages were 
calculated for test-retest ratings on acceptance 
of self and discrepancy scores, and, thus, a total 
of twelve distributions was obtained. The 
means of these distributions are presented in 
Table 2. Subsequently, Index ratings which 
were higher on the retest than on the first test 
will be designated “+,” and those which de- 
creased on retest “—.” 

Each pair of “+” and “—” distributions
was used to give a distribution of
differences. The means of these six distributions
of differences are also contained in Table
2. These mean differences represent the change
in reaction times for “+” and “—” traits from
test to retest, and they show that average reaction
times were faster on the retest than on
the first test.


Table 2
Means and Mean Differences of Reaction Times to
Trait Words When Reaction Times Are Arranged
According to Increase or Decrease in Ratings
on Traits from Test to Retest

                           Mean of means           Mean of
Rating                     1st test    2nd test    differences
Concept of self
  Increase (“+”)*          2.61        2.51        -.10
  Decrease (“—”)†          2.70        2.42        -.22
Acceptance of self
  Increase (“+”)*          2.67        2.53        -.14
  Decrease (“—”)†          2.66        2.39        -.27
Discrepancy
  Increase (“+”)*          2.77        2.53        -.24
  Decrease (“—”)†          2.61        2.50        -.11

* Mean reaction times of words which received a higher
rating on the Index on retest than on first test.

† Mean reaction times of words which received a lower
rating on the Index on retest than on first test.


The above six distributions of differences, 
three “+” and three “—,” were used to calcu- 
late distributions of gain scores. Gain scores 
represent the differences in the changes in re- 
action times for “+” and “—” traits from 







test to retest. If the means of these gain scores 
are significantly different from zero, it may be 
concluded that changes in reaction time to “+” 
and “—” traits are significantly different. The
means, estimated standard errors, t’s, and probabilities
for the gain scores are presented in
Table 3.


Table 3
Significance of Means of Gain Scores for the
Three Index Ratings

                                    Estimated
Rating                  Mean        S.E.        t       p
Concept of self         -.12        .10         1.20    .30
Acceptance of self      -.14*       .07         2.00    .05
Discrepancy             .13         .06         2.17    .05

*Because of “rounding errors,” this mean does not
agree exactly with the difference between the mean
differences shown in lines 3 and 4 of Table 2.
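
The arithmetic linking Tables 2 and 3 can be sketched as follows. The sign convention (change for “—” traits minus change for “+” traits) is an assumption inferred from the printed values rather than stated in the text, and the function names are illustrative. The t ratio is taken as the mean gain divided by its estimated standard error, which agrees in size with the printed t values.

```python
def mean_gain(change_decreased, change_increased):
    """Difference between the mean changes in reaction time.

    Assumed convention: change for traits rated lower on retest minus
    change for traits rated higher on retest, rounded to two decimals.
    """
    return round(change_decreased - change_increased, 2)


def t_ratio(mean, standard_error):
    """t for testing whether a mean gain differs from zero."""
    return mean / standard_error


# Concept of self: mean differences from Table 2, standard error from Table 3.
gain = mean_gain(-0.22, -0.10)
print(gain, round(abs(t_ratio(gain, 0.10)), 2))

# Acceptance of self, using the mean and standard error printed in Table 3.
print(round(abs(t_ratio(-0.14, 0.07)), 2))
```

Because the sketch works from the rounded means in the tables, small disagreements of the kind mentioned in the footnote to Table 3 are to be expected.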


Discussion 


The data confirm Roberts’ conclusion that 
the Index of Adjustment and Values is a valid 
measure of emotionality, at least as far as can 
be established by using reaction time in free 
association as a criterion measure. 

The data show, likewise, that changes in 
ratings of concept of self from test to retest 
are not accompanied by changes in the emo- 
tionality of the traits for the subjects. Roberts 
[2] demonstrated that ratings of concept of 
self do not involve distinguishable differences 
in emotionality and his conclusion is supported 
by the data of this experiment. This is a re- 
sult which might be predicted logically since 
differences in concept of self should not involve 
emotionality unless a discrepancy exists be- 
tween the concept of self and the concept of 
the ideal self or the subject is rejecting of 
self because of his concept of self. Since differ- 
ent ratings on concept of self do not involve 
different degrees of emotionality, it was pre- 
dicted that changes in ratings on concept of 
self from test to retest would not involve 
changes in emotionality. The results of the 
experiment upheld this prediction. 

Roberts demonstrated also that comparisons 
of high and low ratings on acceptance of self, 
and absence or presence of discrepancy between 
the concept of self and the concept of the ideal 
self, revealed significantly different degrees of 




emotionality. This conclusion was substantiated
by the present experiment. Since the variables
of acceptance of self and discrepancy are
indicators of emotionality, it was predicted
that changes in ratings on acceptance of self 
or changes in discrepancy scores from test to 
retest should be accompanied by changes in 
emotionality. The data of this study support 
this hypothesis, and thus it may be concluded 
that changes in ratings of acceptance of self or 
discrepancy on The Index of Adjustment and 
Values are accompanied by changes in the 
emotionality of the traits for the subjects. 


The results of the study clearly demonstrate 
that when ratings are lowered from test to re- 
test, they are accompanied by decreases in free- 
association reaction time which are significant- 
ly greater than the decreases associated with 
ratings which are raised from test to retest. 
This is the opposite of what was predicted. It 
may be reasoned, though, that the ability to 
lower a trait rating on retest may be indicative 
of a lower degree of defensiveness, and it 
would be predicted that such a reaction would 
be accompanied by a decrease in emotionality 
and, therefore, by a decrease in reaction time 
to the trait in free association. 


Regardless of the reason, the data permit 
the conclusion that changes in trait ratings on 
the Index of Adjustment and Values from test 
to retest are accompanied by changes in the 
emotionality of the traits for the subjects. The 
lack of complete reliability of personality tests 
has usually been ascribed to sampling errors, 
errors of measurement, and perhaps to chang- 
es in the subject. This study has demonstrated 
that a statistically significant portion of the 
lack of complete reliability of the Index of 
Adjustment and Values in a test-retest situa- 
tion is due to changes within the subjects. 
These data offer further support for the as- 
sumption that the Index is a valid measure of 
emotionality and changes in emotionality. 


Summary 


At the beginning of a semester fifty volun- 
teer students were tested with the Index of 
Adjustment and Values and a free association 
test which used the traits of the Index as stimu- 
lus words. Fourteen weeks later the subjects 
were retested with both measures. 

On the basis of the first tests it was deter- 










mined that Roberts’ conclusions regarding the 
Index of Adjustment and Values as a valid 
measure of emotionality were supported. 
From the test-retest data it was concluded 
that changes in trait ratings from test to re- 
test are accompanied by changes in the emo- 
tionality of the traits for the subjects and that 
ratings on the Index of Adjustment and Values 
are valid measures of changes in emotionality. 


Received July 8, 1952. 


References 


1. Bills, R. E., Vance, E. L., & McLean, O. S. An 
Index of Adjustment and Values. J. consult. 
Psychol., 1951, 15, 257-261. 

2. Roberts, G. E. A study of the validity of the 
Index of Adjustment and Values. J. consult. 
Psychol., 1952, 16, 302-304. 





Journal of Consulting Psychology
Vol. 17, No. 2, 1953


The Madeleine Thomas Completion Stories Test 


Eugene S. Mills
Whittier College 


In 1949 the writer made a survey of fifty 
psychological clinics and child guidance cen- 
ters to determine the extent to which the 
Madeleine Thomas Completion Stories Test 
was being used in clinical practice. All but two 
of the clinics reported that they were unfamil- 
iar with the test and about half of them asked 
for specific information relating to the test and 
its application. A brief report [1] of this test 
has been made previously, but it is felt that a 
further report is warranted by the interested 
response to the survey questionnaire. 

The Madeleine Thomas Completion Stories 
Test (MTT) was first presented in an ar- 
ticle by Madeleine Thomas [3] in 1937. 
The development of the test was the work of 
the J. J. Rousseau Institute and of the Psy- 
chological Laboratory of Geneva. Helmut 
Wursten, clinical psychologist for the Child- 
ren’s Hospital in Los Angeles, made a trans- 
lation of the test which has been used at that 
hospital and at the Claremont Graduate 
School, Claremont, California, under Flor- 
ence Mateer. Using Wursten’s translation of 
the MTT items, the present writer made an 
exploratory study [2] of the test by administer-
ing it to fifty elementary school children be-
tween the ages of five and fourteen years. To
the best of the writer’s knowledge, this is the
only study of the test to be reported in a psy- 
chological journal since its publication in 1937. 


Description of the MTT 


The originator of the test offers as a basis 
for the MTT the hypothesis that all im- 
aginary creation obeys a certain determinism 
and that one can, when in possession of such a 
creation, infer the psychological causes from 
which it springs. The MTT is a clinical tool 
for releasing this creativity. It consists of fif- 
teen stories or items which relate to the family 
conditions, school experience, and fantasy life
of a fictitious little boy or girl of the same sex
and age as the child being tested. Each story 
poses a problem which is left in suspense, and 
the child is asked to resolve that problem; 
that is, he is asked to finish the story to his own 
liking. The fifteen stories follow: 


1. A boy (or girl) goes to school. During recess 
he does not play with the other children. He stays 
all by himself in a corner. Why? 

2. A boy fights with his brother. Mother comes. 
What is going to happen? 

3. A boy is at the table with his parents. Father 
suddenly gets angry. Why? 

4. One day Mother and Father are a bit angry 
with each other. They have been arguing. Why? 


5a. Sometimes he likes to tell funny stories to his 
friends. What kind? 


5b. Sometimes he likes to tell funny stories to his 
parents. What kind? 

6. A boy has gotten bad grades in school. He re- 
turns home. To whom is he going to show his re- 
port card? Who is going to scold him most? 

7. It is Sunday. This boy has been taken for a 
ride with Mother and Father. Upon their return 
home, Mother is sad. Why? 

8a. This boy has a friend whom he likes very 
much. One day his friend tells him, “Come with me, 
I am going to show you something, but it is a secret. 
Don’t tell anybody.” What is he going to show 
him? 

8b. This boy has a friend whom he likes very 
much. One day his friend tells him, “Listen, I am
going to tell you something, but it is a secret. Don’t 
tell anybody.” What is he going to tell him? 

9a. It is evening. The boy is in bed, the day is 
ended, the light turned off. What does he do? 

9b. What is he thinking about? 

9c. One evening he cries; he is sad. What about? 

10. Then he goes to sleep. What does he dream
about?

11. He wakes up in the middle of the night. He 
is very much afraid. What of? 

12. He goes back to sleep, and this time has a 
very nice dream. A good fairy comes to him and 
says, “I can do anything for you! Tell me what you 
want. I am going to touch it with my magic rod
and all you may wish for is going to come true!”
What does he ask for? 

13. The boy is growing up. Is he anxious to be 
a big boy soon, or would he rather remain a little 
boy for a while? 


14. Among all the fairy tales that have been told 
him, which one does he like the best of all? 


15. Do you remember when you were a little 
boy? What is the first thing that you can remem- 
ber now? 


Administration and Theme Analysis 


The MTT is administered in a room which 
is as free as possible from distracting con- 
ditions. After rapport has been established 
with the child, the examiner begins with 
words to this effect: “We are going to tell 
some stories. I will start them and you will 
finish them. It isn’t hard. You will see.” 
Then the examiner reads the first story in a 
slow, even voice and finishes with an ex- 
pectant attitude. If the child’s response is 
only perfunctory or seems to leave a great deal 
unsaid, the examiner uses a follow-up question 
or comment such as, “Why?” or, “And then 
what happened?” In this manner, it is often 
possible to extend the scope of the response to 
include highly significant material which might 
otherwise have been left unrevealed. The re- 
sponses are recorded verbatim. 


The MTT requires, on the average, about 
20 to 25 minutes for administration. This de- 
pends, of course, upon such factors as the 
speed of response, the length of the response, 
and the amount of resistance encountered;
some children require only about 15 minutes to 
complete the items. 

There are, as yet, no well-established meth- 
ods for analysis of the MTT; however, 
methods for processing other types of projec- 
tive material seem applicable. The writer has 
used the following general approach with 
some success. First, a survey of the test re- 
sponses is made, emphases and trends being 
noted. Second, each response is studied separ- 
ately. An attempt is made to determine the 
dynamics and orientation of the response (i.e., 
sibling rivalry, directed toward a younger sis- 
ter; parent-child tension, directed toward the 
mother). These themes are then recorded in 
a few words on the test form or on a separ- 
ate piece of paper along with the story num-
ber. Third, the responses are then scrutinized
for repetitions, unusual responses, and re- 
sponses which stand in striking contrast to 
the general theme developed in the test. 
Fourth, the most important themes are then 
studied in relation to any biographical mater- 
ial about the child which is available. In this 
way, it is often possible to discover many 
significant facts about the emotional life of 
the child in a relatively short time, depend- 
ing, of course, upon the skill and insight of 
the clinician, and the extent to which he de- 
sires to analyze the test responses. In some 
cases a selective use of certain items has 
proved helpful, especially when the clinician 
suspects difficulty in specific areas of the child’s 
emotional life. This flexibility would appear to 
be one of the major advantages of the tech- 
nique. 


With what ages is the test effective and 
what kind of material is revealed by its use? 
The writer has found the test to be useful be- 
tween the ages of six and thirteen years, with 
its greatest effectiveness occurring with child- 
ren from eight to eleven years, inclusive. As 
with many other tests, much of the effective- 
ness of the MTT depends upon the responsive- 
ness of the child being tested. 


In order of the frequency of occurrence, 
the writer has found the following themes 
revealed by the MTT. The numbers following 
each theme refer to the stories which seem 
to most effectively elicit that particular theme. 


1. Manners and moral conduct—3, 8a, 9a.
2. Fantasy life—11, 10.
3. Parent-child tensions—3, 6.
4. Parental discipline—2, 6, 9c.
5. Anxiety—11, 15, 9c.
6. Social adjustment, good and bad—1, 9c.
7. Likes—12, 13.
8. Parental conflict—4, 7.
9. Sibling rivalry—2, 3.
10. Aggression—1, 13, 9c.
11. Home stability and instability—4, 11.
12. Sex awareness—13, 8b, 9a.
13. Parent-child understanding—6, 3.
14. Escape—1, 8b.
15. Fears—13, 11, 6.
16. School conduct—1, 9b, 12.
17. Teacher-child relationships—1, 6.
18. Deep loss—15, 10.
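For a clinician who wishes to use the items selectively, the theme list can be turned around so that each story points to the themes it tends to elicit. The following Python sketch is an illustration only and is not part of the original report; the names THEME_STORIES and themes_by_story are hypothetical, and the entries "8b" under Escape and "9b" under School conduct assume those readings for two story numbers that are not clearly legible in the printed list.

```python
from collections import defaultdict

# Themes in the reported order of frequency, each with the stories said to
# elicit it most effectively. "8b" (Escape) and "9b" (School conduct) are
# assumed readings of story numbers that are unclear in the source.
THEME_STORIES = {
    "Manners and moral conduct": ["3", "8a", "9a"],
    "Fantasy life": ["11", "10"],
    "Parent-child tensions": ["3", "6"],
    "Parental discipline": ["2", "6", "9c"],
    "Anxiety": ["11", "15", "9c"],
    "Social adjustment, good and bad": ["1", "9c"],
    "Likes": ["12", "13"],
    "Parental conflict": ["4", "7"],
    "Sibling rivalry": ["2", "3"],
    "Aggression": ["1", "13", "9c"],
    "Home stability and instability": ["4", "11"],
    "Sex awareness": ["13", "8b", "9a"],
    "Parent-child understanding": ["6", "3"],
    "Escape": ["1", "8b"],
    "Fears": ["13", "11", "6"],
    "School conduct": ["1", "9b", "12"],
    "Teacher-child relationships": ["1", "6"],
    "Deep loss": ["15", "10"],
}


def themes_by_story(theme_stories):
    """Invert the theme list: story number -> themes it tends to elicit."""
    index = defaultdict(list)
    for theme, stories in theme_stories.items():
        for story in stories:
            index[story].append(theme)
    return dict(index)


if __name__ == "__main__":
    index = themes_by_story(THEME_STORIES)
    # Stories to consider when parental conflict is the suspected area.
    print(THEME_STORIES["Parental conflict"])
    # Themes most often elicited by story 4.
    print(index["4"])
```

Read this way, story 1 appears under five themes, while stories 5a, 5b, and 14 appear under none of the eighteen.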


Other significant indications which may ap-
pear are resistance and distraction. While not
directly revealing the nature of the child’s 
emotional life, these indications are often 
significant when one is cognizant of the 
specific story-items to which the child is re- 
sponding. 

It is hoped that this brief description will 
serve to acquaint clinicians with the Made- 
leine Thomas Test and perhaps stimulate 
further research. 


Received August 26, 1952. 


References 


1. Mills, E. S. The Madeleine Thomas Test as an
aid in reading children. Fourteenth Yearbook,
Claremont College Reading Conference, Clare-
mont, California, 1949.

2. Mills, E. S. A study of the Madeleine Thomas
Completion Stories Test with fifty elementary
school children. Unpublished master’s thesis,
Claremont College Library, Claremont, Cali-
fornia, 1949.

3. Thomas, Madeleine. Méthode des histoires à
compléter. Arch. Psychol., Genève, 1937, 26,
209-284.





Journal of Consulting Psychology
Vol. 17, No. 2, 1953





MMPI Profiles and Personality Characteristics 


H. Birnet Hovey 
Veterans Administration Hospital 
Fort Douglas Station, Salt Lake City, Utah 


This study is concerned with comparing ob- 
served personality characteristics of nonclinical 
individuals with profiles they produced on the 
Minnesota Multiphasic Personality Inventory 
(MMPI) [1]. Contrary to the usual proced- 
ure of comparing test results with groups of 
individuals selected according to certain be- 
havior criteria, this study started by grouping 
persons according to test results, and then as- 
certaining what personality characteristics or 
traits appeared, a procedure similar to the one 
used in compiling An Atlas for the Clinical
Use of the MMPI [2]. The population con- 
sisted of student nurses in practicum training, 
not because of a primary interest in nurses, 
but because data for the most part were al- 
ready available and only waiting to be or- 
ganized and analyzed. The population con- 
sisted of individuals who were making con- 
structive social and vocational adjustments, 
who had been under discerning supervision of 
experienced psychiatric nurses and other staff 
personnel, and for each of whom anecdotal 
records with expressions of behavioral and per- 
sonality characteristics had been made inde- 
pendently by eight of these personnel. 


The program in the Veterans Administra- 
tion Hospital at American Lake, Washington, 
carries on an intensive practicum for student 
nurses, admitting a class of from 16 to 24 stu- 
dents at a time, for a three-month period. 
Within the first day or two, the MMPI is 
administered with other tests and interviews. 
Then the students are separated into small 
groups and assigned to four supervisors in ro-
tation. These supervisors record impromptu
impressions of each student every few days, in
addition to using a rating scale. Also, four ad-
ditional staff personnel, under whom each
student is observed in special assignments,
record comments. These various notes are sent
to central files.


1 From the Veterans Administration Hospital,
American Lake, Washington. Dr. James C. Stauf-
facher, Chief Clinical Psychologist there, gave val-
uable suggestions in the preparation of this study,
and Prof. Allen L. Edwards, University of Wash-
ington, gave essential advice on some of the sta-
tistical problems.


Procedure 


Copious notes and MMPI answer sheets 
were available for study on 97 former stu- 
dents. The MMPI scores were computed, 
using K (and omitting Mf). There turned 
out to be 92 personality characteristics men- 
tioned by at least two supervisors in describ- 
ing students. One-half represented assets, while 
the other half represented liabilities approxi- 
mately opposite to the assets.2 Tetrachoric 
correlation coefficients (tr) were computed be- 
tween scores on MMPI scales and frequencies 
of characteristics or traits as mentioned by su- 
pervisors. For one scale at a time, the distri- 
bution of subjects in terms of scores was di- 
vided dichotomously at the 75th percentile. 
Whenever a dividing point could not be made 
close to this percentile because of a few sub- 
jects having the same score in that region, 
these few cases were separated according to 
the relationship of subject’s score on that scale 
to her own mean score for all the scales. (For 
instance, if three subjects had identical scores 
cutting the distribution to one side or the other 
of the 75th percentile on a scale under con- 
sideration, and one of the cases was needed in
the high-scoring group to round it out closer
to a quarter of the distribution, the subject
with the lowest general profile level would be
included as her score on that scale would be
relatively higher than those of the other two
subjects in terms of their over-all profiles.)
In computing an r, the top 25%, or high score,
was compared with the remainder of the dis-
tribution for proportions in which a trait oc-
curred. The same procedure was used for low
score, this time using the 25th percentile as
the dividing point. Divisions were made near
the ends of distributions rather than at the
centers, because of interest in what the more
extreme scores might represent, analogous to
an interest in superior intellect or in mental
deficiency.


2 Whenever a given asset had been mentioned by
one supervisor and its approximate opposite by an-
other supervisor, neither was used. This occurred
infrequently. Far more frequently the items had
been mentioned by at least two supervisors. In
fact over 40% of all the entries used in the study
were supported by the same entries on the same
individuals being made by at least two different
supervisors.

Comparisons were made for high and low 
scores on all eight of the clinical scales, with 
each of the 92 traits. High and low L, K, 
and mean score were also included. There 
were too few ? scores above zero, and the 
spread in F scores was too narrow to include 
these in the comparisons. For each scale and 
in both directions, the traits were listed which 
correlated with it at or above the .05 level of 
confidence. Coefficients which failed to reach 
this level were dropped from further consid- 
eration. In addition to traits, the MMPI scales 
were compared with elements evaluated by the 
supervisors on the formal rating scale. 

These procedures were repeated on a new 
group of 40 student nurses. Only correlations 
between MMPI scale scores and personality 
characteristics which reached the .05 level or 
better both times were considered as sufficient- 
ly significant for reporting here. The new 
group contained too few low K scores for use 
in computations. 
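The dichotomizing and fourfold comparison described in this section can be sketched in Python. This is a minimal illustration under stated assumptions and not the computation actually used in the study: the function names are hypothetical, ties at the cutting point are handled by a simple nearest-rank rule rather than by the profile-based rule described above, and the tetrachoric coefficient is estimated with the familiar cosine-pi approximation, since the source does not say how the coefficients were obtained.

```python
import math


def high_group(scores, pct=75):
    """Flag subjects scoring above the pct-th percentile (nearest-rank rule)."""
    ranked = sorted(scores)
    k = math.ceil(pct / 100 * len(ranked))  # 1-based nearest-rank position
    cutoff = ranked[k - 1]
    return [s > cutoff for s in scores]


def fourfold(in_group, has_trait):
    """Count the cells of the 2 x 2 table for group membership by trait."""
    a = b = c = d = 0
    for g, t in zip(in_group, has_trait):
        if g and t:
            a += 1      # in the extreme group, trait mentioned
        elif g:
            b += 1      # in the extreme group, trait not mentioned
        elif t:
            c += 1      # remainder of distribution, trait mentioned
        else:
            d += 1      # remainder of distribution, trait not mentioned
    return a, b, c, d


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation (assumed method)."""
    if b * c == 0:
        # Undefined when both cross-products vanish; otherwise the limit is +1.
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1 + math.sqrt(a * d / (b * c))))


if __name__ == "__main__":
    # Hypothetical scale scores and trait mentions for eight subjects.
    scores = [48, 52, 55, 57, 60, 63, 70, 75]
    trait = [False, False, True, False, False, True, True, True]
    table = fourfold(high_group(scores), trait)
    print(table, tetrachoric_cos_pi(*table))
```

For low scores the same functions apply after reversing the direction of the cut, for example by negating the scores before calling high_group.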


Results 


For the 92 traits considered in this study, 
there was a possibility of 1932 coefficients of 
relationship, of which 115 turned out to have 
tr’s at the .05 level of significance or higher, 
for both the original group of 97 students and 
the new group of 40 students. Below are listed 
deviations in the positive and negative direc- 
tions for the various MMPI scales, and the 
related traits. The traits are listed in descend-
ing order according to size of tr. The chi-
square test (χ²) was applied, and those items
which reached the .05 level or better for both
the original and new samples are italicized.
(Some findings not supported by χ² nor listed
are discussed in footnote 3.) An asterisk (*)
denotes that the relationship is supported by a 
significant one, at the .05 level, in the opposite 
direction for the group scoring opposite on the 
same scale. All remaining items in the list are 
similarly supported but not significantly so. 
When a trait in the list is followed by “—minus,”
a negative r is denoted. However,
when significantly positive and negative rela- 
tionships obtained for traits with the two ex- 
tremes of scales respectively, only the positive 
ones are listed, the negative ones being implied 
by the asterisks. 


High L score: *Afraid of mental patients, *Has
difficulty expressing self orally, Persevering, Pas-
sive, *Feels insecure about self

Low L: *Poised and at ease around others, Not in-
dustrious, Friendly

High K: Reserved, Mature, Afraid of mental pa-
tients, Has difficulty expressing self orally,
Friendly

Low K: (Too few cases in the new group for sta-
tistical treatment)

High Hs: *Adjusts slowly, Alertness—minus, Ease
of oral expression—minus

Low Hs: (None other than opposite of first item for
High Hs)

High D: *Shy, *Nonaggressive

Low D: *Initiative, *Poised and at ease around
others, Participates actively in group discussions,
*Adjusts rapidly, *Good socializer, *Emotionally
stable, Efficient, *Ease of oral expression, Desires
responsibilities

High Hy: Immature, *Friendly, Enthusiastic, Care-
less in personal appearance, *Cooperative, Cheer-
ful

Low Hy: Industrious—minus, Self-confident

High Pd: *Participates actively in group discus-
sions, *Initiative, *Aggressive, Desires responsi-
bilities, Not industrious, Adjusts rapidly, Self-
confident, Shy—minus, *Unafraid of mental pa-
tients, Works persistently with assigned patients
—minus, Enthusiastic

Low Pd: *Suggestions accepted willingly, *Perse-
vering, Stimulating personality—minus

High Pa: *Lacks self-confidence, Dependent and
submissive, Outgoing—minus

Low Pa: Adjusts rapidly, Dependable—minus,
*Poised and at ease around others

High Pt: *Participates little in group discussions,
Poor socializer, Shy, Ingenious—minus, Neat in
personal appearance

Low Pt: (None other than opposite of first item
for High Pt)

High Sc: Participates actively in group discussions,
Ingenious, Initiative, Poor judgment, Not en-
thusiastic

Low Sc: Friendly, Alert

High Ma: *Ease of oral expression, *Initiative,
*Ingenious, Self-confident, Efficient, Conscientious
—minus, Reserved—minus, Responsible, Effective
with mental patients, Shy—minus, Immature,
Poised and at ease around others, Leadership
qualities

Low Ma: *Participates little in group discussions,
Adjusts slowly, Persevering

High ave: *Initiative, Ease of oral expression—
minus

Low ave: Reserved

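The figure of 1932 possible coefficients given at the head of this section can be checked by simple counting. The sketch below is one reading that happens to reproduce the reported total, not a statement of how the count was actually made: it assumes eight clinical scales plus L, K, and mean score, each in a high and a low direction, with low K omitted because the new group contained too few cases. The chance expectation further assumes independent tests and independent samples, which is unlikely to hold strictly, and all variable names are hypothetical.

```python
# Scales entering the comparisons (Mf omitted; ? and F excluded in the text).
clinical_scales = ["Hs", "D", "Hy", "Pd", "Pa", "Pt", "Sc", "Ma"]
other_measures = ["L", "K", "mean"]
directions = ["high", "low"]

pairs = [(s, d) for s in clinical_scales + other_measures for d in directions]

# Assumption: low K is dropped because the new group had too few cases.
usable = [p for p in pairs if p != ("K", "low")]

n_traits = 92
possible = n_traits * len(usable)

# Rough chance expectation for reaching the .05 level in both samples,
# assuming independence between coefficients and between samples.
alpha = 0.05
expected_by_chance = possible * alpha * alpha

observed = 115  # coefficients reported as significant in both groups

if __name__ == "__main__":
    print(len(usable), possible)          # 21 1932
    print(round(expected_by_chance, 2))   # 4.83
    print(observed)
```

On these assumptions about five doubly significant coefficients would be expected by chance alone, against the 115 reported.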
In view of the above results, the possibility 
of predicting final grades achieved in the course 
from MMPI scores was investigated. High 
and low deviation on each scale, high and low 
mean score, wide and narrow spread between 
scale scores were tried, but significant relation- 
ships were not found. The lack of significant 
relationships might be due to the students hav- 
ing already been screened for motivation and 
aptitudes connected with nursing and there- 
fore not representing an unselected sample. It 
is also possible that some of these persons may 
have adequate counterbalancing assets that en- 
able them to function effectively in spite of 
certain personality weaknesses, and such assets 
may or may not be reflected in the MMPI 
profiles. The lack of success for prediction of 
grades from profiles was about the same as 
Weisgerber’s [3]. Also, in agreement with 
his study, there was little success in discover- 
ing the students who needed special counseling. 
A special scale for predicting grades was con- 
structed, based on item analyses of A versus D
students. But when the scale was applied to the 
new group of 40 students, prediction turned 
out to be little better than chance. 


Discussion 


The primary objective of the study was to 
ascertain what kinds of personality character- 
istics might be related to high and low scores 
on the various scales of the MMPI, for a 
group of normal individuals. The more sig-
nificant relationships are listed above.3 Spon-
taneous notes made by the supervisor-observers
were more discriminating than formal ratings
made on a rating scale, the spontaneous charac-
terizations correlating more significantly with
MMPI scores. In fact, no significant r’s
emerged for relationships between the scales
and the formal ratings. This may be due to
the formal ratings’ containing more compre-
hensive judgments, or possibly because they
were not on-the-spot observations, or perhaps
because they were used for purposes of motiva-
tion. Our findings here correspond also with
Weisgerber’s [3].


3 In most instances and as might be expected, a
trait correlating with a scale did so in opposite di-
rections for high and low scores for that scale. The
list contains these instances. However in some other
instances, traits showed up as associated (according
to tr technique) with deviation scores in one direc-
tion only for a given scale. These are not in the
list. For instance high Pa correlated significantly
with “persevering” but there was also a small posi-
tive correlation of this trait with low Pa. “Friend-
ly” and “ingenious” were associated with low Hs,
yet high Hs was not at all associated with a lack of
these traits. High D was negatively correlated with
“friendly” which means that supervisors noted rela-
tively few of the individuals with elevated scores
on this scale as being especially friendly, yet at the
same time the high D groups did not contain any
more than chance expectancy of subjects noted as
being on the unfriendly side.

According to tr coefficients obtained, a scale ele-
vation might mean either strength or else manifest
weakness of the same characteristic. High Sc cor-
related positively with “participates actively in
group discussions” and also with “participates little
in group discussions.” When high Sc was accom-
panied by high Pd and Ma, the individuals con-
cerned tended to participate actively, whereas if
neither Pd nor Ma were high, they were prone not
to participate. These relationships held for 13 pro-
files dominated by the three scales, versus 9 profiles
with both Pd and Ma being several points below
Sc. The tally was 11 to 1, and 2 to 3, for “actively”
to “little,” for the two groups respectively. Again,
when high Sc profiles for 18 individuals who par-
ticipated actively were compared with 10 who did
not, the medians for the latter group were nine
points below the former on Pd and four points be-
low it on Ma. Individuals with elevated Sc scores
may be considered as having schizoid trends, and
they may be inclined either to withdraw from group
participation, or else be dominant in the group. One
student who obtained a high Sc score accompanied
by elevated Pd and Ma scores was characterized as
“participating actively,” and otherwise was noted to
be very aggressively outspoken in class and to
change the topic.


The list of characterizations not only reveals
differences within one group of individuals but
has also shown itself to have predictive value
when applied to a new and similar group. So
far, the list for predictions has not been used
in individual cases, but such a study has been
initiated. For groups at least, MMPI profiles
can be used to predict personality character-
istics which are not necessarily features of
emotional illness. However, a word of caution 
may be in order should one try to predict those 
characteristics found in our study for other 
kinds of groups. For instance, in spite of our 
coefficients reaching significance in both our 
samples, they might apply only to student 
nurses who have achieved advanced training 
status in the Pacific Northwest, and who are 
observed by the particular supervisors who 
have been participating in our study. 

The list reveals that some of the traits are 
correlated with more than one scale deviation. 
In such instances, the occurrence of two or 
more of the involved deviations in a sample 
should increase the proportion of the trait ap- 
pearing in that sample. There were too few 
cases of the various kinds of profiles for com-
prehensive statistical analysis, but one sample
tends to bear it out. There were 26 profiles 
with the two highest scores being on Pd and 
Ma, which was the largest number of profiles 
dominated by any two particular scales. These 
two scales are correlated with “participates ac- 
tively in group discussions.” The group of in- 
dividuals producing those 26 profiles contained 
almost half again as high a proportion of per- 
sons with this characteristic as did two groups 
dominated by Pd and Ma separately. On the
other hand, if a trait is correlated positively
with one scale deviation and negatively with
another, concurrence of the two deviations
should tend to cancel so far as that trait is con-
cerned. “Participates little in group discus-
sions” happens to correlate with high Pt. There
were 16 profiles dominated by elevations on 
this scale together with Pd or Ma but not 
both. The group producing these profiles con- 
tained no more of the trait within it than 
chance expectancy. 

Deviations on every one of the individual 
scales were more loaded with significant rela- 
tionships than in mean score or general level 
of profile. Only three traits correlated with 
high and low mean score, whereas the various 
scales excepting Hs carried from 5 to 16 trait 
relationships. Also, none of the items related 
to high and low mean score is supported by 
χ², while every one of the scales carries one or
more such items. This may have something to 
do with the inability to select effectively from 
total score students in need of counseling, or
to predict grades better than chance. In this
study, shape of profile was more fruitful than
its general elevation.

Asset traits as well as liabilities may be as- 
sociated with elevations on MMPI scales, and 
negatively valued traits may be associated with 
subaverage scores on the scales. From one to 
seven positive traits were associated with 
elevated scores on seven of the ten scales. One 
positive trait was even associated with a high 
over-all mean score. Conversely, from one to 
five negative traits were associated with low 
scores on all scales but Hs, Pt, and Sc, and also 
a negative trait was related to low mean score. 
When only those items supported by χ² are
considered, the elevated scores are related to 
twelve “desirable” and eight “undesirable” 
traits. By the same criterion, the low scale 
scores are related to thirteen positive traits 
and six negative ones. Elevated scores in clini- 
cal practice tend to signify various kinds of 
maladjustment or potentials for it, and in view 
of this, it may be inferred from our results 
that there are some personality assets associat- 
ed more with maladjustment potentials as 
measured by the MMPI than with relative 
freedom from such potentials. 


Summary 


A group of 97 student nurses in practicum 
training had been given the MMPI, and dur- 
ing the training period the supervisors made 
notes relating to personality characteristics for 
each student. Tetrachoric r and also χ² were
applied to ascertain any relations between high 
scores, and also low scores, on the various 
MMPI scales and observed personality char- 
acteristics. The procedure was then applied to 
a new group of 40 student nurses. Traits are 
listed which correlated most significantly with 
one or more of the scales for both groups of 
students. 

Impromptu notes made by supervisors 
showed more significant relationships with 
MMPI scale scores than did ratings made on 
the rating scales. Clinical observations to the 
effect that for interpretation, deviations on 
individual scales must be considered along 
with deviations on other scales, are supported 
by the data. Individual scores on most scales 
showed up as more meaningful than general 
elevation of profile. Some traits of positive
value as well as ones of negative value were
found to be associated with elevations of vari- 
ous scales, and some negative traits were re- 
lated to low scale scores. In other words, po- 
tential for emotional maladjustment may carry 
along with it or enhance some positive person- 
ality characteristics. 


Received September 22, 1952. 


References 


1. Hathaway, S. R., & McKinley, J. C. Minne-
sota Multiphasic Personality Inventory. (Rev.
Ed.) New York: Psychological Corp., 1951.

2. Hathaway, S. R., & Meehl, P. E. An atlas for
the clinical use of the MMPI. Minneapolis:
Univer. of Minnesota Press, 1951.

3. Weisgerber, C. A. The predictive value of the
Minnesota Multiphasic Personality Inventory
with student nurses. J. soc. Psychol., 1951, 33,
3-11.







Journal of Consulting Psychology 
Vol. 17, No. 2, 1953 


A Comparison of the WISC and Stanford-Binet
IQ’s of Normal Children 


Glen A. Holland*


University of California, Los Angeles 


An appreciable literature has begun to 
accumulate concerning the Wechsler Intelli- 
gence Scale for Children. Comparison of this 
scale with the Stanford-Binet is inevitable. 
One reason for this is that the Stanford-Binet 
in its original and revised forms has consti- 
tuted a standard for intelligence measurement 
for 36 years. Also, its frequent use in a wide 
variety of situations has provided individual 
testers with operational meanings for its 
scores over and beyond those given by its 
standardization procedure. An obvious econo- 
my of effort is involved if the correlation of 
the Stanford-Binet with a new scale appears 
to warrant some degree of generalization from 
this accumulated experience. 


Up to June, 1952, six studies have been 
reported comparing the Stanford-Binet with 
the WISC using normal subjects [1, 2, 5, 11], 
two of these being unpublished Master’s 
theses reported by Pastovic and Guthrie [5] 
and not seen by the writer. Three studies have 
been done using defective subjects ranging 
from the moron through borderline grades of 
intelligence [4, 8, 9]. The procedures, data, 
and conclusions reported suggest certain hy- 
potheses. These will be given below in num- 
bered statements along with relevant evidence. 


1. The order of testing (S-B first or WISC 
first) produces a statistically significant dif- 
ference in the mean IQ’s obtained with the 
WISC. Although this specific hypothesis has 
not been studied, it is implied in at least four 
studies in which the investigators report that
they have alternated the order of testing
[1, 2, 5, 8]. 


*The writer wishes to express his appreciation to 
the following persons who did testing for this study: 
Joyce Beak, Annette Lawton, Patricia Marshall, 
Marie O’Hara, Edward Ormsby, Virginia Sand- 
borg, Gordon Southon, and W. G. Willis. 


2. The correlations with Stanford-Binet 
IQ’s will be significantly greater for WISC 
Full Scale and WISC Verbal Scale than for 
WISC Performance Scale IQ’s. Relatively
clear-cut results to this effect are given by 
three studies [1, 8, 11]. Two others have not 
reported relevant data [4, 9]. Pastovic and 
Guthrie [5] cite one study in which this hy- 
pothesis receives support and three others in 
which the results are ambiguous. Krugman, 
Justman, and Wrightstone [2] found that for 
their total group of 332 subjects the corre-
lations with Stanford-Binet IQ’s for the Full
Scale and Verbal Scale were higher than for 
the Performance Scale. However, at some age 
levels there was little difference between the 
correlations for Verbal Scale and Performance 
Scale with S-B IQ, and in a few instances 
the Performance Scale correlation was 
greater. 


3. The correlation with Stanford-Binet 
IQ’s will not be significantly different for 
WISC Full Scale and WISC Verbal Scale 
IQ’s. Two studies report practically identical 
correlations between these two WISC scales 
and the Stanford-Binet [8, 9].Weider, Noller, 
and Schramm [11] found identical correla- 
tions for their subjects as a whole, but when 
the group was divided with respect to age a 
difference appeared for the younger group 
(r = .82 for the Verbal Scale, r = .90 for
the Full Scale). Pastovic and Guthrie [5] 
indicate approximately equal correlations with 
the Stanford-Binet for the two Scales in one 
study which they report but find differences 
of from .06 to .08 in three others. The dif-
ference between the correlations is .09 in the
data of Frandsen and Higginson [1] and .18 
for the total group of subjects studied by 
Krugman, Justman, and Wrightstone [2]. 


4. There is a significant relationship be-
tween the age of the subjects tested and dif-
ferences in Stanford-Binet and WISC IQ’s.
Krugman, Justman, and Wrightstone [2] 
found a significant relationship between chron- 
ological age and differences between S-B and 
WISC IQ’s for both the Verbal Scale and 
the Full Scale. A chi-square analysis indicated 
that the differences were greater for their 
younger subjects, with the WISC IQ’s being 
lower. Pastovic and Guthrie state, “We con- 
clude that the WISC should not be interpre- 
ted as equivalent to a Binet IQ at age levels 
below 10 years since the WISC score is con- 
sistently lower than that of the Binet” [5, 
p. 385]. Still more evidence comes from the 
work of Weider, Noller, and Schramm [11], 
who found significant differences in mean IQ 
between the Stanford-Binet and the WISC 
for subjects up to 7 years, 11 months old, but 
relatively small differences for older subjects. 
Frandsen and Higginson [1] report differ- 
ences which are statistically significant for 
their group of fourth-graders ranging in 
chronological age from 9 years, 1 month to 
10 years, 3 months. The three studies of de- 
fective subjects with mean chronological ages 
of 11 years, 11 months or older [4, 8, 9] have 
agreed in finding relatively small differences 
between mean S-B and WISC IQ’s with the 
WISC IQ’s being consistently higher. Nale
[4] reports, however, that the difference of 
2.59 IQ points which he found is significant 
at the .01 level of confidence. 

5. There is a significant relationship be- 
tween the size of Stanford-Binet IQ’s and 
the difference between Stanford-Binet and 
WISC IQ’s. Such a relationship is reported 
by Krugman, Justman, and Wrightstone [2], 
who correlated S-B IQ’s with the differences 
between S-B and WISC IQ’s for all three 
of the WISC Scales. The three product- 
moment correlations obtained were all signifi- 
cantly greater than zero. 

Although not pertinent to the comparison of 
Stanford-Binet and WISC IQ’s, two addi- 
tional hypotheses of some interest are sug- 
gested by the studies done to date. 

6. There is a significant difference between 
mean WISC Verbal Scale and Performance 
Scale IQ’s. This is the contention of Pastovic 
and Guthrie, who state, “. . . our data and
those of three other studies cited, find a 


higher mean performance IQ than verbal IQ 
over a wide range of intelligence, although 
the WISC was intended to have a difference 
of zero points between these averages” [5, 
p. 385]. Seashore [7], however, on the basis 
of the test data for the entire WISC stand- 
ardization sample, found zero difference be- 
tween the mean IQ’s of the Verbal and Per- 
formance Scales. 


7. There is a significant correlation between 
WISC Full Scale IQ’s and the difference be- 
tween WISC Verbal Scale and Performance 
Scale IQ’s. The evidence on this point is weak,
being based on the observation of defective 
subjects only. Seashore [7] reports that for 
55 feebleminded subjects in the WISC stand- 
ardization sample the mean “discrepancy 
score” (Verbal Scale IQ minus Performance
Scale IQ) was −2 points. A difference in
mean IQ of 4.9 points (Performance Scale 
mean IQ greater) was found by Sloan and 
Schneider [8]. A difference in the same direc- 
tion of 5 IQ points is reported by Stacey and 
Levin [9]. For both of these latter compari- 
sons the difference in means is statistically 
significant at the .001 level of confidence 
(computed by this writer). Whether subjects 
of superior intelligence have higher Verbal 
Scale than Performance Scale IQ’s has not 
yet been reported, but the possibility seems 
worth investigating. 


Procedure and Subjects 


The testing for the present study was done 
by a small group of senior and graduate stu- 
dents of the University of California at Los 
Angeles. Before the study, these students had
completed, or were concurrently completing,
at least a year’s work in
intelligence testing, including a semester of 
intelligence test practice under the supervision 
of the writer. 


These students were instructed to examine 
at least ten subjects of varying ages with both 
the Stanford-Binet Intelligence Scale and the 
Wechsler Intelligence Scale for Children. 
Half of their subjects were to be given the 
Stanford-Binet first and half the WISC first. 
The test booklets were then checked by the 
writer for scoring errors and evidences of 
errors in administration. From the test book-
lets which remained after this screening, 26
were chosen representing subjects who were
given the WISC first. The ages of the sub- 
jects ranged from 5 through 13 years with 
a mean of 9.3 years and a standard deviation 
of 2.5 years. Seventeen were boys and 9 were 
girls. The mean Stanford-Binet IQ for this
group was 113.7 (SD 15.8). An additional 
26 test booklets were then chosen representing 
subjects who had received the Stanford-Binet 
first, the selection being made in such a way 
as to reproduce so far as possible the statisti- 
cal characteristics of the group mentioned 
above. For the second group the mean chron- 
ological age was also 9.3 years and the SD 
2.5 years. Fourteen of this group were boys 
and 12 were girls. The mean Stanford-Binet 
IQ was 113.7 (SD 15.5). After the equated 
groups were made up, an additional 20 records 
remained representing subjects who had been 
tested with the S-B first. These were added 
to those mentioned above for the purposes of 
the chi-square analyses to be reported. 


The longest interval between the two tests 
was 84 days (1 subject), and the shortest 
interval was 2 days (2 subjects). Most sub- 
jects were given the two tests a week apart, 
the median interval for the entire group be- 
ing 7.5 days. Forty-one of the subjects were 
tested with Form L of the Stanford-Binet and 
11 were tested with Form M. 


Results 


1. The effect of order of testing on WISC 
IQ’s. For the group examined with the WISC 
first the mean IQ’s obtained were 110.4 for 
the Verbal Scale, 110.8 for the Performance 
Scale, and 111.4 for the Full Scale. For the 
group examined with the Stanford-Binet first 
the corresponding mean IQ’s were 111.3, 
111.3, and 112.5. The t values [6] for the
differences between means were .22 for the 
Verbal Scale, .13 for the Performance Scale, 
and .27 for the Full Scale. None of the dif- 
ferences are statistically significant. 
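
The t computation for two independent groups of 26 can be sketched in Python from summary statistics. The means below are those reported above for the Verbal Scale. The standard deviation of 15 points is an assumed value, since the WISC standard deviations of the two groups are not reported here, and the pooled-variance formula is one common form of the test rather than necessarily the one given in [6].

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t for the difference between two independent group means."""
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / se_diff

# Verbal Scale means from the text; the SD is hypothetical.
ASSUMED_SD = 15.0
t_verbal = pooled_t(110.4, ASSUMED_SD, 26, 111.3, ASSUMED_SD, 26)
print(f"t = {t_verbal:.2f} with {26 + 26 - 2} degrees of freedom")
```

With any standard deviation near the usual IQ scale value, a difference of less than one point between groups of this size yields a t far below conventional significance levels.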

2. and 3. The correlation of Stanford-Binet 
and WISC IQ’s. For the 52 subjects in the 
equated groups the product-moment correla- 
tion of the Stanford-Binet with WISC IQ’s 
was .88 for the Verbal Scale, .73 for the 
Performance Scale, and .87 for the Full 
Scale. The critical ratio for the difference
between the first two correlations, follow-
ing the statistical procedure suggested by
McNemar [3], is 2.96, and the probability 
that the correlations are equal is less than 
.002. The critical ratio for the differences 
between the second and third correlations is 
4.55, indicating a probability of less than .001 
that the correlations are equal. However, 
when the correlations of the Stanford-Binet 
with the Verbal Scale and with the Full Scale 
are compared, the critical ratio for the differ- 
ence in correlations is only .36. It is obvious 
that for this comparison the null hypothesis 
cannot be rejected with any appreciable de- 
gree of confidence. 
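
One common test for the difference between two correlations that share a variable in the same sample is Hotelling's t. The Python sketch below applies it to the correlations reported here. It is offered only as an illustration: it is not claimed to be the procedure of [3], and its values need not equal the critical ratios given above. The intercorrelations among the WISC Scales (.75 for Verbal with Performance, .94 for Verbal with Full) are those reported for the present sample in the Discussion.

```python
import math

def hotelling_t(r_xy, r_xz, r_yz, n):
    """Hotelling's t for r_xy versus r_xz when x, y, z come from one sample.

    Degrees of freedom are n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    return (r_xy - r_xz) * math.sqrt((n - 3) * (1 + r_yz) / (2 * det))

N = 52
# Stanford-Binet with Verbal (.88) versus with Performance (.73).
t_verbal_vs_perf = hotelling_t(0.88, 0.73, 0.75, N)
# Stanford-Binet with Verbal (.88) versus with Full Scale (.87).
t_verbal_vs_full = hotelling_t(0.88, 0.87, 0.94, N)
print(f"Verbal vs. Performance: t = {t_verbal_vs_perf:.2f}")
print(f"Verbal vs. Full Scale:  t = {t_verbal_vs_full:.2f}")
```

The pattern agrees with the conclusions drawn above: the first comparison gives a large t and the second a negligible one.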

4. Age and differences between Stanford- 
Binet and WISC IQ’s. The statistical analy- 
sis of the relation between age and differences 
between Stanford-Binet and WISC IQ’s is
based on the records of 72 subjects. A 2 × 2
table was set up dividing subjects with respect 
to age at the 10-year level and with respect 
to differences between Stanford-Binet and 
WISC IQ’s into those for which the differ-
ence was 5 points or less and those for which
the difference was 6 points or greater. The 
value of chi square for differences between 
Stanford-Binet and Verbal Scale IQ’s was 
1.006, which has a probability greater than 
.30 of occurring in samples in which there is 
no relationship between the variables com- 
pared. The value of chi square for Stanford- 
Binet and Performance Scale IQ differences 
was .082, p greater than .70. The value of 
chi square for differences between Stanford-
Binet and Full Scale IQ’s was 1.47, p greater
than .20. None of the chi-square values 
warrants the rejection of the null hypothesis 
of no relationship between chronological age 
and differences in Stanford-Binet and WISC 
IQ’s. 
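
The probabilities quoted for these chi-square values can be checked with a short Python sketch. It assumes 1 degree of freedom, as is appropriate for a 2 × 2 table, and uses the identity that a chi-square variate with 1 degree of freedom is the square of a standard normal deviate. The function name is illustrative only.

```python
import math

def chi_square_p_1df(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    # P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(chi_square / 2.0))

# Chi-square values reported above for age versus IQ difference.
reported = {"Verbal": 1.006, "Performance": 0.082, "Full": 1.47}
p_values = {scale: chi_square_p_1df(x) for scale, x in reported.items()}
for scale, p in p_values.items():
    print(f"{scale} Scale: chi square = {reported[scale]}, p = {p:.2f}")
```

Each computed probability falls above the bound stated in the text for the corresponding Scale.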

To provide data comparable with those of 
other studies, mean IQ’s were computed 
separately for subjects 10 years and older and 
for subjects less than 10 years old. The mean 
Stanford-Binet IQ for 23 subjects 10 years
or older was 115.3. The mean WISC IQ’s 
were 113.2 for the Verbal Scale, 112.9 for 
the Performance Scale, and 113.6 for the 
Full Scale. The t test [6] indicates that none
of the mean WISC IQ’s can be considered
significantly different from the mean Stanford-
Binet IQ for these subjects, the t values being
1.35 for the Verbal Scale, 1.00 for the Per- 
formance Scale, and 1.00 for the Full Scale. 
The mean Stanford-Binet IQ for the 29 
subjects in this study who were less than 10 
years old was 112.3. The mean WISC IQ’s
were 109.1 for the Verbal Scale, 109.6 for
the Performance Scale, and 110.6 for the Full
Scale. The corresponding t values for differ-
ences between mean Stanford-Binet and mean
WISC IQ’s are 2.26 for the Verbal Scale,
1.37 for the Performance Scale, and 1.41 for
the Full Scale. Of these t values, only the
first approaches statistical significance, the 
probability being .012 of obtaining a differ- 
ence as large as that observed when there is 
no real difference in the means compared. 


5. Stanford-Binet IQ’s and differences be- 
tween Stanford-Binet and WISC IQ’s. The 
records of 72 subjects were used for the chi- 
square analysis of the relationship between 
Stanford-Binet IQ’s and the difference in 
Stanford-Binet and WISC IQ’s. Subjects 
were divided into those with Stanford-Binet 
IQ’s of 114 or greater and those with Stan-
ford-Binet IQ’s of less than 114. Each group
was then divided into those with Stanford- 
Binet and WISC IQ differences of 6 points 
or more and those with differences of 5 
points or less. The value of chi square when 
Verbal Scale IQ’s were involved was .845 
with a p greater than .30 favoring the null 
hypothesis. Chi square when Performance 
Scale IQ’s were involved was .716 with p
greater than .30. Chi square when Full Scale 
IQ’s were involved was .258 with p greater 
than .50. None of the chi-square values indi- 
cates a significant relationship between Stan- 
ford-Binet IQ’s and the difference between 
Stanford-Binet and WISC IQ’s. 
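
The dichotomized analysis described above can be sketched in Python as a chi-square computed from a 2 × 2 table of counts. The cell counts below are hypothetical, chosen only so that they sum to 72; the actual cell frequencies are not reported. No continuity correction is applied, which is also an assumption, since the text does not say whether one was used.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    # Shortcut formula for a fourfold table.
    return n * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)

# Hypothetical counts. Rows: Stanford-Binet IQ of 114 or greater, less than 114.
# Columns: IQ difference of 5 points or less, 6 points or more.
high_small, high_large = 18, 17
low_small, low_large = 22, 15
chi_square = chi_square_2x2(high_small, high_large, low_small, low_large)
print(f"chi square = {chi_square:.3f}")
```

A table in which the two rows divide in nearly the same proportions, as in this made-up example, gives a chi square well below the value required for significance with 1 degree of freedom.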


6. The difference between WISC Verbal 
Scale and Performance Scale IQ’s. The data 
for 52 subjects in the present study show a 
mean difference of 0.0 between Verbal Scale 
and Performance Scale IQ’s on the WISC.
The standard deviation of the distribution of 
differences was 10.15 IQ points. The standard 
error of the mean difference is 1.42, which 
would indicate that for the universe repre- 
sented by this sample the probability is .01 
that the mean difference might be greater 


than 3.75, either plus or minus. 
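
The standard error quoted above can be reproduced with the following Python sketch. That the divisor is the square root of N − 1 is an inference from the reported figures, not something stated in the text; the multiplier implied by the 3.75-point limit is simply computed from the two reported numbers.

```python
import math

SD_DIFFERENCES = 10.15  # standard deviation of Verbal minus Performance IQ
N_SUBJECTS = 52

# Standard error of the mean difference with an N - 1 divisor (assumed).
se_n_minus_1 = SD_DIFFERENCES / math.sqrt(N_SUBJECTS - 1)
# The same quantity with an N divisor, for comparison.
se_n = SD_DIFFERENCES / math.sqrt(N_SUBJECTS)
# Number of standard errors represented by the 3.75-point limit.
implied_multiplier = 3.75 / se_n_minus_1

print(f"SE with N - 1: {se_n_minus_1:.2f}")
print(f"SE with N:     {se_n:.2f}")
print(f"3.75 points = {implied_multiplier:.2f} standard errors")
```

Only the N − 1 form rounds to the reported 1.42.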

When the differences between Verbal and 
Performance Scale IQ’s are taken without
regard to algebraic sign the data of the 
present sample indicate a crude mode for the 
distribution of 4.5 points difference in IQ 
with 27% of all differences being either 4 or 
5 points. The median absolute difference was 
6.8 points and the mean absolute difference 
8.4 points. Q1 for the distribution of absolute
differences was 4.43, and Q3 was 12.0.

7. WISC Full Scale IQ’s and the differ- 
ence between Verbal and Performance Scale 
IQ’s. For 52 subjects the discrepancy scores 
(Verbal Scale IQ minus Performance Scale 
IQ) suggested by Seashore [7] were com-
puted and correlated with WISC Full Scale
IQ’s. The product-moment correlation was 
.14 with a standard error of .14. Such a corre- 
lation cannot be considered as significantly 
different from zero. 
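
A brief Python sketch shows why a correlation of .14 from 52 subjects is indistinguishable from zero. It assumes the standard error of r under the hypothesis of zero correlation is taken as 1 divided by the square root of N − 1, which reproduces the reported value; the text does not state the formula used.

```python
import math

N_SUBJECTS = 52
R_OBSERVED = 0.14  # discrepancy score correlated with Full Scale IQ

# Standard error of r when the true correlation is zero (assumed formula).
se_r = 1.0 / math.sqrt(N_SUBJECTS - 1)
# The correlation expressed in standard-error units.
critical_ratio = R_OBSERVED / se_r

print(f"SE of r = {se_r:.2f}, critical ratio = {critical_ratio:.2f}")
```

A correlation about equal to its own standard error falls far short of any conventional significance level.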


Discussion 


The present study fails to support many 
of the trends noted in earlier studies of the 
WISC. Although several investigators have 
alternated the order of testing with the WISC 
and Stanford-Binet (and there is no reason 
for failing to do so in the future as a matter 
of precaution), the present report indicates 
that the order of testing does not have a sig- 
nificant effect on mean IQ for any of the 
WISC Scales. The data of this study agree 
with those of the majority of previous investi- 
gations in finding that the Stanford-Binet does 
not correlate as highly with the Performance 
Scale of the WISC as it does with the Verbal 
and Full Scales. They also indicate no sig- 
nificant difference in the correlation of the 
Stanford-Binet with the Verbal Scale and 
with the Full Scale IQ’s of the WISC, al- 
though previous information on this point has 
varied among different investigators. 

Contrary to the conclusions of at least two 
other investigations [2, 5] the present study 
fails to find clear-cut evidence of a relation- 
ship between the age of subjects tested and the 
difference in Stanford-Binet and WISC IQ’s. 
A chi-square analysis of absolute differences 
and age failed to show a significant relation- 
ship. Nor was it possible to find a significant 
relationship between the size of Stanford-
Binet IQ’s and the amount of difference be-
tween Stanford-Binet and WISC IQ’s.


The present study agrees with Seashore’s 
[7] report on the standardization sample of 
zero mean difference between Verbal and 
Performance Scale IQ’s and suggests that 
this difference cannot in all probability be 
greater than 3.75 IQ points in the universe 
represented by the sample studied. The tenta- 
tive hypothesis that there might be a relation- 
ship between WISC Full Scale IQ and dif- 
ferences between Verbal and Performance 
Scale IQ for normal subjects was not substan-
tiated.

In considering the results of the studies of 
the WISC to date, the writer wishes to put 
forth some tentative suggestions. The first is 
that the use of the WISC with high-grade 
defectives appears to give results which are 
relatively unique to the defective group. (This 
is illustrated by the fact that for all reports in 
this area the mean Performance Scale IQ is 
consistently greater than mean Verbal Scale 
IQ.) The second is that generalizations about
the WISC based upon the study of a relatively 
homogeneous age group may not apply to 
groups of subjects which are more heterogene- 
ous with respect to age. 


The question remains as to why the results 
of the present study appear to differ from 
those of the two studies employing normal sub- 
jects of various ages [2, 11]. So far as the 
study of Weider, Noller, and Schramm [11] 
is concerned, the present data agree on the ex- 
tent of correlation between Stanford-Binet 
and WISC IQ’s but differ with respect to 
the relation of chronological age to differences 
between Stanford-Binet and WISC IQ’s. 
The subjects used in the earlier study were 
somewhat younger than those in this study 
and had lower mean IQ’s. However, even 
when only subjects less than 8 years old from 
the present study are considered (to form a 
group comparable to the “younger” group in 
the previous study), there is still no signifi- 
cant difference between Stanford-Binet and 
WISC IQ’s. Neither was it possible to find 
a significant relationship between Stanford- 
Binet Mental Ages and Stanford-Binet— 
WISC IQ differences. The only remaining 
obvious difference in the nature of the two
studies lies in the fact that the Stanford-
Binet and WISC were usually given either in
two sessions on the same day or in a single 
session in the Weider, Noller, and Schramm 
study. This suggests the possibility of constant 
errors of measurement (particularly with 
younger subjects) based on fatigue, loss of 
interest, and similar progressive changes in 
the subject. There is no indication in the re- 
port that the order of testing was varied, 
which might become an important considera- 
tion when the interval between tests is quite 
short. 

Even this possibility does not exist to ac- 
count for the differences in results between 
the present study and that of Krugman, Just- 
man, and Wrightstone [2]. Their work was 
much the more extensive, including a total of 
332 subjects well distributed throughout the 
age range to which the WISC is applicable. It 
must be considered the definitive study of 
Stanford-Binet—WISC comparisons to date. 
Nevertheless, the fact that the present study 
failed to find a significant relationship between 
either chronological age or Stanford-Binet IQ 
and the difference between S-B and WISC 
IQ’s may serve to stimulate further research 
and to suggest caution in making generaliza- 
tions. 

Since the fact that student examiners gave 
the scales in the present study might suggest 
that this was a factor in the results obtained, 
the writer has compared the results of their 
testing with that of the standardization test- 
ing on two points where comparable data are 
available. As noted above, Seashore [7] found 
a mean difference of zero between Verbal and 
Performance Scale IQ’s and a standard de- 
viation for the differences of 12.5 points. He 
reports a median of 8 points for the absolute 
differences. The data of this study show a 
mean difference of zero points with a stand- 
ard deviation for the differences of 10.2 points 
and a median of 6.8 points for the absolute 
differences. For his subjects 10½ years old
Wechsler [10] reports correlations of .68 
between Verbal and Performance Scale IQ’s, 
.93 between Verbal and Full Scales, and .90 
between Performance and Full Scales. For 
the present study (in which the mean age was 
9.3 years) the corresponding correlations are 


75, .94, and .92. 





152 


Summary 


Seventy-two subjects were given both the 
Wechsler Intelligence Scale for Children and 
the Stanford-Binet Revised Intelligence Scale. 
Twenty-six subjects were examined with the 
WISC first and could be equated, as a group, 
with 26 subjects examined with the Stanford- 
Binet first. A statistical analysis of the results 
indicated:


1. No significant practice effect on WISC 
IQ’s when the Stanford-Binet was given first 
and the median interval between tests was 
7.5 days.

2. Significant differences between the corre- 
lation of Stanford-Binet with Performance 
Scale IQ’s and the correlations of Stanford-
Binet with either Verbal or Full Scale IQ’s
on the WISC. 

3. No significance for the difference be- 
tween the correlation of the Stanford-Binet 
with Verbal Scale and its correlation with 
Full Scale IQ’s on the WISC. 

4. No significant relationship between 
chronological age and the differences between 
Stanford-Binet and WISC IQ’s for any of 
the WISC Scales. 

5. No significant relationship between 
Stanford-Binet IQ and the differences be- 
tween Stanford-Binet and WISC IQ’s for 
any of the WISC Scales. 

6. Zero mean difference between WISC 
Verbal Scale and Performance Scale IQ’s with 
a standard error of the mean of 1.42 points. 

7. No significant relationship between 
WISC Full Scale IQ’s and the difference be- 
tween WISC Verbal Scale and Performance 
Scale IQ’s.


Received August 19, 1952. 





Glen A. Holland 


References 


1. Frandsen, A. N., & Higginson, J. B. The Stan- 
ford-Binet and the Wechsler Intelligence Scale 
for Children. J. consult. Psychol., 1951, 15, 
236-238. 

2. Krugman, Judith I., Justman, J., & Wright-
stone, J. W. Pupil functioning on the Stanford-
Binet and the Wechsler Intelligence Scale for
Children. J. consult. Psychol., 1951, 15, 475-
483.

3. McNemar, Q. Psychological statistics. New
York: Wiley, 1949.

4. Nale, S. The Childrens-Wechsler and the Bi-
net on 104 mental defectives at the Polk State
School. Amer. J. ment. Def., 1951, 56, 419-423.

5. Pastovic, J. J., & Guthrie, G. M. Some evi-
dence on the validity of the WISC. J. consult.
Psychol., 1951, 15, 385-386.

6. Peatman, J. G. Descriptive and sampling sta-
tistics. New York: Harper, 1947.

7. Seashore, H. G. Differences between Verbal
and Performance IQs on the Wechsler Intel-
ligence Scale for Children. J. consult. Psychol.,
1951, 15, 62-67.

8. Sloan, W., & Schneider, B. A study of the
Wechsler Intelligence Scale for Children with
mental defectives. Amer. J. ment. Def., 1951,
55, 573-575.

9. Stacey, C. L., & Levin, Janice. Correlation
analysis of subnormal subjects on the Stanford-
Binet and Wechsler Intelligence Scale for Chil-
dren. Amer. J. ment. Def., 1951, 55, 590-597.

10. Wechsler, D. Wechsler Intelligence Scale for
Children. New York: Psychological Corp.,
1949.

11. Weider, A., Noller, P. A., & Schramm, T. A.
The Wechsler Intelligence Scale for Children
and the Revised Stanford-Binet. J. consult.
Psychol., 1951, 15, 330-333.







Journal of Consulting Psychology 
Vol. 17, No. 2, 1953 


Note on Elwood’s Study of IQ Changes 


Quinn McNemar 


Stanford University


A recent paper by Elwood [1] purports 
“to examine the validity” of corrections, sug- 
gested by the present writer [2, pp. 172-174], 
to IQ’s for differences in IQ variability at
different age levels. She reasons that since 
corrections at age 6 tend to make low IQ’s
lower, one would expect those with low (un- 
corrected) IQ’s at age 6 to have lower values 
at age 8, where no corrections are needed. 

Her method involves a comparison of test-
retest IQ’s of three groups of children tested
and retested at about ages 6 and 8, the three 
groups having been selected on the basis of the 
first test as those falling in the three intervals 
50-59, 60-69, and 70-79 on the IQ scale. 
Then on the basis of averages, the “expected” 
drop in IQ’s at the second testing did not
occur, hence it is implied that the recom- 
mended corrections are not sound. 

What Elwood failed to consider is the fact 
that a regression upward has neatly counter- 


balanced the “expected” drop in IQ. Indeed, 
the lower variability at age 6, compared to 
that at age 8, is responsible for the failure of 
these cases, so chosen, to show a regressive 
gain. If Elwood wishes to make an unambig- 
uous factual contribution to this problem by 
the longitudinal approach, she should simply 
compare the SD for all cases tested at circa
age 6 with the SD for the same cases retested 
two years later. It must, of course, be pre- 
sumed that the cases are unselected, i.e., typi- 
cal of 6-year-olds. 
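
The counterbalancing described above can be illustrated with the usual linear regression of retest IQ on first-test IQ. In the Python sketch below every numerical value is hypothetical: the test-retest correlation of .80 and the standard deviations of 12.5 at age 6 and 16 at age 8 are chosen only to show the direction of the effect and are not taken from Elwood's data or from [2].

```python
def expected_retest_iq(first_iq, r, sd_first, sd_retest, mean=100.0):
    """Expected retest IQ from the linear regression of retest on first test."""
    # The slope is r scaled by the ratio of the two standard deviations.
    slope = r * (sd_retest / sd_first)
    return mean + slope * (first_iq - mean)

FIRST_IQ = 65.0  # a child selected for a low IQ at age 6 (hypothetical)

# Equal variability at both ages: the low score regresses toward the mean.
equal_sd = expected_retest_iq(FIRST_IQ, r=0.80, sd_first=16.0, sd_retest=16.0)
# Lower variability at age 6 than at age 8: the regressive gain is cancelled.
unequal_sd = expected_retest_iq(FIRST_IQ, r=0.80, sd_first=12.5, sd_retest=16.0)

print(f"Equal SD's:   expected retest IQ = {equal_sd:.1f}")
print(f"Unequal SD's: expected retest IQ = {unequal_sd:.1f}")
```

Under these assumed values a selected low group shows a gain of several points when variability is the same at both ages, but essentially no change when variability is smaller at the first testing, so an unchanged average on retest is not evidence against the corrections.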


Received August 27, 1952. 


References 


1. Elwood, Mary I. Changes in Stanford-Binet IQ 
of retarded six-year-olds. J. consult. Psychol., 
1952, 16, 217-219. 

2. McNemar, Q. The revision of the Stanford-Binet 
Scale. Boston: Houghton Mifflin, 1942. 


New Books and Tests


Books


Bowers, Henry. Research in the training of teach- 
ers. Toronto: J. M. Dent & Sons (Canada) Ltd., 
& Macmillan Co. of Canada, 1952. Pp. vii + 
167. $1.90. 


The book consists of a series of fourteen papers 
dealing with some of the many factors that influ- 
ence a student’s success in practice teaching. Each 
paper is a separate unit dealing with one phase of 
the problem. They are carefully organized, stating 
clearly the question, the methods used, the results, 
and the conclusions obtained from the study. The 
concluding summary analyzes the student’s back- 
ground, personality traits, interests and activities, 
social acceptance, capacity for leadership and other 
factors, in relation to their effect on the student’s 
success in practice teaching. The book should be of 
special interest to those who are training our fu- 
ture teachers, and to high-school counselors who are 
advising students in regard to their possible success 
as teachers. One wishes that the study were based 
on success in teaching after several years of work, 
rather than on the more academic situations found 
in practice teaching.—B. M. L. 


Davidson, Henry A. Forensic psychiatry. New 
York: Ronald Press, 1952. Pp. viii + 398. $8.00. 


This book is designed as a psychiatric-legal guide 
for physicians, particularly psychiatrists, who may 
be called upon as expert witnesses. It begins with a 
careful discussion of the McNaghten formula, the 
basis for the determination of criminal responsi- 
bility in this country. It includes description of 
types of examination and reports, and discussion of 
the physician’s role in cases involving personal in- 
jury, annulment of a marriage, custody of children, 
evaluation of testamentary capacity, appraisal of 
sex offenders, alcoholics, juvenile offenders and 
malingerers, and a description of commitment pro- 
cedures and the competency of the mentally ill. The 
situations chosen to illustrate the discussion of mal- 
practice as it applies to the psychiatrist will be of 
interest to psychologists, particularly one in which 
a psychologist was the chief target of criticism. The 
book also includes a discussion of courtroom prac- 
tice. — A.R. 





Note: The reviews were prepared by the Editor 
and the Associate Editors, who may be identified by 
their initials. 




Eissler, Ruth S. et al. The psychoanalytic study of 
the child, Vol. VII. New York: International Uni- 
versities Press, 1952. Pp. 448. $7.50. 


Of 21 chapters in this volume, the first 4 constitute 
a symposium on the Development of Ego and Id, 
held at the Congress of the International Psycho- 
analytical Association at Amsterdam, Holland, in 
1951. This symposium is highly technical, and 
probably of interest only to analysts. Most psycholo- 
gists will probably find the other chapters of greater 
interest, since they are more clinically oriented. The 
majority of them deal with very young children— 
age two to four—and three sections deal with 
anxiety factors related to hospitalization of young 
children. An interesting methodological approach 
in the second of these three discussions is the use of 
a film for the psychoanalytic study of a young 
child. One of the most valuable discussions of this 
volume is a 65-page chapter by Gerald H. J. Pear- 
son, “A Survey of Learning Difficulties in Child- 
ren.” This is both a survey of the psychoanalytic 
literature on the subject, as well as an elaboration 
of the author’s thesis, “The ability to learn is a 
function of the ego and therefore its disorder is 
produced by influences which affect the ego func- 
tions.” Seven major factors causing learning diffi- 
culties are discussed, and abstracts from 37 cases 
are used illustratively.—M. K. 


Eysenck, H. J. The scientific study of personality. 
London: Routledge and Kegan Paul, and New 
York: Macmillan, 1952. Pp. xiii + 320. $4.50. 


This volume, in a sense a sequel to the Dimen- 
sions of Personality (1947), reports more of Ey- 
senck’s original and ingenious experiments. His use 
of objective tests in the appraisal of personality, in 
contrast to questionnaires and projective methods, 
deserves wider attention in the United States. The 
studies lead to one main proposition, of the exist- 
ence of three dimensions of personality which are 
relatively independent: introversion-extroversion, 
neuroticism, and psychoticism. In this frame of ref- 
erence “. . . the question: ‘Is this person psychotic or 
neurotic?’ becomes as unreasonable as the question: 
‘Is this person intelligent or tall?’” (p. 285). The 
experiments which support these hypotheses are, in 
the main, neatly designed and analyzed, and yield 
sharp tests of propositions deduced from theory. 
Some of them, however, seem to suffer from the lack 
of cross validation on independent groups. Such a 
statically cross-sectional approach, however ably ap-
plied, leads almost inevitably to a hereditarian posi-
tion. Eysenck espouses “constitution” as the cause
of individual differences in personality, and gener- 
ally belittles psychogenic hypotheses based on clini- 
cal evidence. One may hope that someday he will 
undertake the experimental investigation of learn- 
ing, and apply his considerable talents to the study 
of change as well as to the study of status. —L.F.S. 


Fiedler, Miriam F. Deaf children in a hearing 
world. New York: Ronald Press, 1952. Pp. viii 
+ 320. $5.00. 


The notion that handicapped children should be 
reared and educated in close association with norm-
al children is widely accepted but all too rarely
practiced. This semipopular book, for parents and 
teachers as much as for psychologists and other 
specialists, reports experiences arising from an ex- 
periment in the normal education of deaf and hard- 
of-hearing children. The chapters contain an ac- 
count of the experiment, case studies of children, 
and discussions of parents’ problems based on 
transcribed recordings of discussion groups.—L.F.S. 


Gilbert, Jeanne G. Understanding old age. New 
York: Ronald Press, 1952. Pp. ix + 422. $5.00. 


This is a welcome addition to the growing litera- 
ture on senescence, by a pioneer, productive, well- 
qualified writer. As an organized informational 
compendium it will be helpful both to present work- 
ers in the field and to those who may not yet be 
versed in the field’s importance. It leaves unsolved 
many issues which becloud the critical evaluation 
of the many studies here collected. Is gerontology 
“the scientific study of the phenomena of aging” 
(opening sentence, p. 3)? Is aging a continuing 
anabolic-catabolic reciprocation from conception to 
death, or is it some later or final segment of growth 
and decline? Which of its instar stages is really 
old age? And do we “age” at equal rates at all 
life-span periods, or do our organ-systems regress 
unequally during involution as they progressed in 
evolution? Who is old, when, why, and by what 
individual differences? Is longevity positively cor- 
related with favorable aptitudes, or do the good die 
young? What constitutes a standard sample for ex- 
perimental and control studies? Does “old age” 
comprise a “clinical” syndrome? If so, is it primar- 
ily psychosomatic or somatopsychic? Until such 
issues are resolved one may still endorse the hack-
neyed quip that “old age is any age older than
mine.” Dr. Gilbert recognizes these problems and 
offers much material toward their solution. Adding 
to our knowledge about old age increases our un- 
derstanding of it. — E. A.D. 


Hilgard, Ernest R., Kubie, Lawrence S., & Pum- 
pian-Mindlin, E. Psychoanalysis as science. Stan- 


ford, Calif: Stanford Univer. Press, 1952. Pp. x 
+ 174. $4.25. 


The title of this book is eye-catching, but its 
content turns out to be something of a disappoint- 
ment. It is a clear and able summary of much that 
has been said about psychoanalysis, but contains no 
strikingly new evidence. Of five lectures given at 
California Institute of Technology in 1950, the first 
two, by Hilgard, summarize the findings of experi- 
mental psychology which are relevant to psycho- 
analytic dynamics and therapy. Two lectures by 
Kubie describe the techniques of psychoanalysis, 
with persuasive but nonexperimental illustrative 
anecdotes, and cite the main premises of psycho-
analysis. In the final lecture, Pumpian-Mindlin
traces the historical development of psychoanalysis 
and its relationships to the biological and social 
sciences.—L.F.S. 


Krieg, Wendell J. L. Functional neuroanatomy 
(2nd Ed.). Philadelphia: Blakiston, 1953. Pp. 
xviii + 659. $9.00. 


This magnificent text and book of charts de- 
serves to be brought to the attention of all psycholo- 
gists. It is outstanding in two respects. First, it is 
more genuinely functional than most books on 
neuroanatomy with clear descriptions of what the 
neural structures do. Second, many of its 274 fig- 
ures, and its 88 pages of atlas, have a remarkable 
three-dimensional quality which facilitates visualiz- 
ing the nervous system “in the round.” While most 
effectively used in connection with a laboratory 
course, the text is a suitable adjunct for any course 
in physiological psychology.—L. F. S. 


Reik, Theodor. The secret self. New York: Farrar, 
Straus, & Young, 1952. Pp. 329. $3.50. 


In his fiction-like style Reik opens with a conver- 
sation he had with Freud, and closes with satisfy- 
ing insights he experienced one dawn as he ended 
an insomnious night of personal struggle with deep 
sleep. Such is the tone of another volume devoted 
to the thesis that psychoanalysis is a great adven- 
ture in inner experience and not a routine with 
mental gadgets manipulated by skilled technicians 
of the unconscious. To be a creative experience for 
the patient it must first be such for the analyst. And 
so it is for Reik. He demonstrates this in twenty 
chapters of subjective notes linked with case ma-
terial and classical literature. These are “impres-
sions obtained in everyday life and in analytic
sessions [which] awakened echoes of other inner 
experiences shaped by a great writer.” Reik adds a 
contribution indeed to our contemporary scientific 
emphasis in clinical psychology. His voice reminds 
us that the direction of human adjustment is at best 
a creative process, an aspect which needs emphasis 
in our training program. Science and this subjective 
emphasis together may teach us that the analyst can
“listen to his patient and to his own associations,”
but this art is a highly finished one requiring much 
nurture and difficult to validate. — F. McK. 


Shaffer, G. Wilson, & Lazarus, Richard S. Funda- 
mental concepts in clinical psychology. New York: 
McGraw-Hill, 1952. Pp. xi + 540. $6.00. 


This unusually comprehensive text on the theory 
and practice of clinical psychology is written clear- 
ly and well. While suitable for advanced under- 
graduates and beginning graduate students, more 
advanced students may read it with profit. At many 
points, the mature practicing clinician may wish for 
a more penetrating discussion, but the book will 
offer him an integrated view of the field that is not 
easily obtainable elsewhere. The treatment ranges 
well beyond the narrow field of clinical psychology 
itself to supply supplementary background material 
from all of psychology. Chapters on scientific 
methodology and on the nature of personality illus- 
trate this breadth. While this comprehensive ap- 
proach inevitably necessitates some superficiality of 
treatment at times, the net result remains at a high 
level, and the book should make an excellent text. 
— W. A. H. 


Wechsler, David. The range of human capacities. 
Baltimore: Williams & Wilkins, 1952. Pp. ix + 
190. $4.00. 


The changes in this revision consist primarily in 
the addition of two chapters, on span of life as a 
human capacity, and on range in productive opera- 
tions, together with increased emphasis on the real- 
ity of decreased capacity with aging. The discus- 
sion is based on the concept of the validity of a 
range ratio as a measure of human capacity. The 
ratio settled upon is that between the 2nd and 
999th individual in every thousand. After examina- 
tion of a quantity of data, including 26 new sets of 
measurements, the author believes that “human vari- 
ability, when compared to that of other phenomena 
in nature is extremely limited, and the differences 
which separate human beings from one another 
with respect to whatsoever trait or ability we may 
wish to compare, are far smaller than is ordinarily 
supposed.” The other phenomena in nature to which 
he refers are various physical constants; no bio- 
logical data other than those on man are adduced. 
Omitting special cases, the total range ratios pre- 
sented fall between 1.16:1 and 2.93:1. Their distri- 
bution is multimodal, indicating the following 
groupings: linear body measures (ave. 1.30:1), 
metabolic rates (1.39:1), body circumference 
(1.52:1), physiologic function (2.07:1), motor co- 
ordination and speed of movement (2.23:1), meas- 
ures of body weight (2.33:1), and perceptual and 
intellectual abilities (2.58:1). — A.R. 
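
The range ratio described in this review can be illustrated with a minimal sketch. It assumes the ratio is formed by ranking exactly one thousand positive measurements and dividing the 999th by the 2nd. The function name `range_ratio` and the simulated statures are hypothetical, introduced only for illustration, and are not taken from the book.

```python
import random


def range_ratio(values):
    """Ratio of the 999th to the 2nd value among 1000 ranked measurements.

    Assumption: measurements are ranked in ascending order, so the 2nd
    individual is the second-smallest and the 999th the second-largest.
    """
    if len(values) != 1000:
        raise ValueError("expected exactly 1000 measurements")
    ranked = sorted(values)
    low, high = ranked[1], ranked[998]  # 2nd and 999th individuals
    if low <= 0:
        raise ValueError("measurements must be positive")
    return high / low


if __name__ == "__main__":
    # Hypothetical data: 1000 simulated statures in centimeters.
    rng = random.Random(0)
    statures = [rng.gauss(170.0, 7.0) for _ in range(1000)]
    print(f"{range_ratio(statures):.2f}:1")
```

Dropping the single lowest and highest individuals keeps one extreme case from dominating the ratio, which may be why the 2nd and 999th ranks are used rather than the absolute minimum and maximum.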




Tests 


Bellak, Leopold, & Bellak, Sonya S. Supplement to 
the Children’s Apperception Test (CAT-S). Ages 
3-10. Individual test. 10 pictures, with manual, 
pp. 8 ($6.00). New York (Box 42, Gracie Sta- 
tion): C. P. S. Co., 1952. 


The ten supplementary pictures for the CAT are 
designed for use with children whose complaint 
problems or previous responses suggest particular 
areas of investigation. The ten situations, enacted 
by human-like animals comparable to those in the 
CAT, represent peer play, schoolroom, playing 
house, mothering, injury, competition, body image, 
visit to physician, bathroom scene, and pregnant 
mother. The manual recommends several methods 
for using the pictures, and describes typical respons- 
es, with limited normative data on 40 six- and 
seven-year-olds. These ingenious materials will be 
a useful tool for research and clinical appraisal. 
—L. F. S. 


Jay, Edith Sherman. A book about me. Preschool- 
gr. 1, ages 4-7. 1 form. Untimed. Booklets ($2.40 
per 10); analysis sheets (55¢ per 25); manual, 
pp. 32 (25¢); specimen set (50¢). Chicago: Sci- 
ence Research Associates, 1952. 


A Book About Me provides material for a survey 
of children’s backgrounds, maturity, activities, and 
interests, at the prekindergarten to first-grade 
level. Children mark pictures in an attractive book 
according to standard instructions for a formal 
survey, or according to more flexible procedures for 
other uses which may be planned by the teacher or 
investigator. The manual describes the survey pro- 
cedures and the auxiliary analysis sheet, and sug- 
gests further uses of the material in school activity 
programs. — L.F.S. 


Books Received 

Frankel, George W. Let’s hear it. New York: 
Stratford House, 1952. Pp. 63. $1.00. 

Garrett, James F. (Ed.) Psychological aspects of 
physical disability. Federal Security Agency, Of- 
fice of Vocational Rehabilitation, Rehabilitation 
Service Series No. 210. Washington: U.S. Gov- 
ernment Printing Office, 1952. Pp. vii + 195. 
45¢. 

Rickman, John (Ed.) On the bringing up of chil- 
dren. (2nd Ed.) New York: Robert Brunner, 
1952. Pp. xxii + 243. $3.00. 

Steiner, Lee R. A practical guide for troubled 
people. New York: Greenberg, 1952. Pp. 209. 
$3.50. 

Thompson, Charles B., & Sill, Alfreda P. Our 
common neuroses. New York: Exposition Press, 
1952. Pp. xxxii + 210. $3.50. 
















JOURNAL OF CONSULTING PSYCHOLOGY
AVAILABLE BACK ISSUES

Back issues listed below are available for sale

                                         PRICE PER   PRICE PER
YEAR   VOLUME   NUMBERS AVAILABLE        NUMBER      VOLUME
1937     1      1 - 4 - 6                $.60        $3.00
1938     2      1 ? 3 4 5 6              $.60        $3.00
1939     3      1 2 3 4 5 6              $.60        $3.00
1940     4      1 2 - 4 - -              $.60        $3.00
1941     5      1 2 3 4 5 6              $.60        $3.00
1942     6      1 - 3 - - 6              $.60        $1.80
1943     7      - 2 3 4 5 6              $.60        $3.00
1944     8      1 - 3 4 5 6              $.60        $3.00
1945     9      - - - 4 5 6              $.60        $1.80
1946    10      1 2 3 4 5 -              $.60        $3.00
1947    11      - - - 4 5 6              $.60        $1.80
1948    12      1 2 3 4 ? 6              $1.00       $5.00
1949    13      1 2 3 4 5 6              $1.00       $5.00
1950    14      1 2 3 4 5 6              $1.00       $5.00
1951    15      1 2 3 4 5 6              $1.00       $5.00
1952    16      1 2 3 4 5 6              $1.00       $5.00
1953    17      By subscription, $7.00   $1.25       $7.00


Table based on inventory of January 1, 1953 




The Journal of Consulting Psychology is of particular interest to clinical psy-
chologists, psychiatrists, school psychologists, and persons who are engaged in
counseling and guidance work.


From 1937 through 1947 the price is $.60 per issue and $3.00 per volume. From
1948 through 1952 the price is $1.00 per issue and $5.00 per volume. Beginning
in 1953, the price is $1.25 per issue and $7.00 per volume. For foreign orders,
$.25 per volume should be added. Address orders to:


AMERICAN PSYCHOLOGICAL ASSOCIATION 
1333 Sixteenth Street N. W.
Washington 6, D. C.






















A Modern Evaluation 
of Sensory Psychology 





The HUMAN SENSES 


by FRANK A. GELDARD 
Professor of Psychology, University of Virginia


“The Human Senses” is the modern, comprehensive descrip-
tion of man’s senses from a psychophysiological point of
view. It pinpoints the place of sensation in psychological
and physiological knowledge, and unfolds the drama of each
of the senses, stressing their importance in the total be-
havior picture.


This broad, unified account of the mainsprings of human
action belongs in every psychologist’s library. The work is
valuable for its detailed chapters on the often neglected
cutaneous senses. Psychologists will welcome the full dis-
cussion of the phenomena of each of the senses with the
thorough treatment of both the physics of stimuli and the
physiology of the sense organs. Of special interest, too, is
the emphasis on the importance of sensation in modern
engineering psychology and recent human engineering de-
velopments.


You will value “The Human Senses” for its unity and range.
Dr. Geldard writes with the authority of twenty-five years
of active work in the field. Clearly written and copiously
illustrated, “The Human Senses” may well prove to be the
definitive work on this vital phase of psychology.


February, 1953 365 pages 104 ill. $5.00 
SEND NOW FOR AN ON-APPROVAL COPY








)- 4th Ave, New York 16, N.Y. 











